Wednesday, May 16, 2007

 

Save and Restore Files/Images to SQL Server Database

The usual consensus is that it is bad idea to store large files or images in a database, and it is recommended that you instead store the file system path to the image in the database and store each image as a separate file in a suitable folder structure. This helps to reduce the size of the database and keeps it performing efficiently. Despite the disadvantages, there are some advantages in storing images in the database such as keeping all data in a single place.

Storing a file system path is the preferred technique in many situations, but if you have to store the images (such as icons) in an SQL Server database, you should create a separate table containing the image/binary data column (and ideally place it in a different filegroup residing on a separate physical disk). Then reference the images from your main table using a foreign key. This will help to reduce wasted space and improve performance.

There are quite a few partial solutions and code snippets around to solve this problem, and I’ve helped answer this question a few times, both on and off the forums. So I decided to write an easy to reuse solution with simple semantics, and have the ability to automatically compress data going in and out of the database. [The download link is towards the end of this post].

The goal was to be able to simply store a file in the database and receive back an ID so it can be used to reference and retrieve the file:

// Create a FileStore object
DBFileStore dbf = new DBFileStore(connectionString, CompressMode.Compressed);
 
// Add file to DB
Stored storeID = dbf.Store(filename);
 
// …OR since returned Stored object is never null (but Id might be zero)
int fileID = dbf.Store(filename).Id;
 
//...
 
// Restore file from db to file system with original filename
string localfilename = dbf.Restore(fileID, @"C:\temp");


Originally, I had the Store() method just returning a plain int, but then decided that I wanted the GUID value returned as well. It should be easy to revert the code if you wish…

At the heart of the Store() methods is the following private method that actually does the write to the database:

private Stored Store(byte[] bytearray, int origLength, 
                     string name, int? parentID, int? refID, 
                     CompressMode compress)
{
    int fileID = 0;
    Guid guidID = new Guid();
    Stored retID = new Stored(fileID, guidID);
 
    // Execute stored proc to add data to repository
    using (SqlConnection conn = new SqlConnection(this.connectionString))
    {
        using (SqlCommand cmd = conn.CreateCommand())
        {
            cmd.CommandText = "AddToRepository";
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.AddWithValue("@filename", Path.GetFileName(name));
            cmd.Parameters.AddWithValue("@length", origLength);
            cmd.Parameters.AddWithValue("@compressedlength", bytearray.Length);
            cmd.Parameters.AddWithValue("@binarydata", bytearray);
            cmd.Parameters.AddWithValue("@refID", refID);
            cmd.Parameters.AddWithValue("@parentID", parentID);
            cmd.Parameters.AddWithValue("@iscompressed",
                (compress == CompressMode.Compressed) ? true : false);
 
            // Add output parameters
            SqlParameter p = cmd.Parameters.Add("@ID", SqlDbType.Int);
            p.Direction = ParameterDirection.Output;
            p = cmd.Parameters.Add("@GUID", SqlDbType.UniqueIdentifier);
            p.Direction = ParameterDirection.Output;
 
            // No retry logic; any Exceptions bubble up to caller...
            conn.Open();
            cmd.ExecuteNonQuery();
            conn.Close();
 
            // Retrieve output values
            fileID = (int)cmd.Parameters["@ID"].Value;
            guidID = (Guid)cmd.Parameters["@GUID"].Value;
 
            retID = new Stored(fileID, guidID);
        }
    }
 
    return retID;
}


Similarly, the corresponding method that decompresses and writes to the file system:

/// <summary>
/// Restores a saved file from Repository to file system.
/// </summary>
/// <param name="fileID">Integer ID of the file to restore</param>
/// <param name="filepath">Either the folder which the original file 
/// is to be restored to, OR the "full file path" + "\" + "new name"</param>
/// <returns>The full filename of the restored file/data</returns>
public string Restore(int fileID, string folderfilepath)
{
    string filename = String.Empty;
 
    // Execute stored proc to read data from repository
    using (SqlConnection conn = new SqlConnection(this.connectionString))
    {
        using (SqlCommand cmd = conn.CreateCommand())
        {
            cmd.CommandText = "LoadFromRepository";
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.AddWithValue("@ID", fileID);
 
            conn.Open();
            using (SqlDataReader rdr =
                       cmd.ExecuteReader(CommandBehavior.CloseConnection))
            {
                if (rdr.Read())
                {
                    filename = SaveToFileSystem(rdr, folderfilepath);
                }
            }
        }
    }
 
    return filename;
}
 
private string SaveToFileSystem(SqlDataReader rdr, string folderfilepath)
{
    byte[] binaryData = (byte[])rdr["binarydata"];
    string dbFilename = rdr["filename"].ToString();
    int length = (int)rdr["length"];
    bool isCompressed = (bool)rdr["iscompressed"];
    DateTime CreationTime;
    DateTime LastWriteTime;
 
    DateTime.TryParse(rdr["FileCreatedDate"].ToString(), out CreationTime);
    DateTime.TryParse(rdr["FileModifiedDate"].ToString(), out LastWriteTime);
 
    string filename;
    if (Directory.Exists(folderfilepath))
    {
        // Folder only passed: restore as folder + original name...
        filename = Path.Combine(folderfilepath, dbFilename);
    }
    else
    {
        // Caller has specified exact folder and filename we should restore as...
        filename = folderfilepath;
    }
 
    using (FileStream fs = new FileStream(filename, FileMode.CreateNew, FileAccess.Write))
    {
        if (isCompressed)
        {
            MemoryStream ms = new MemoryStream(binaryData);
            byte[] decompressedData = new byte[length];
            using (GZipStream cgz = new GZipStream(ms, CompressionMode.Decompress))
            {
                cgz.Read(decompressedData, 0, length);
            }
            fs.Write(decompressedData, 0, length);
        }
        else
        {
            fs.Write(binaryData, 0, length);
        }
    }
 
    // Set file creation/modified date and time if these exist...
    FileInfo fi = new FileInfo(filename);
    if (CreationTime != DateTime.MinValue)
    {
        fi.CreationTime = CreationTime;
    }
    if (LastWriteTime != DateTime.MinValue)
    {
        fi.LastWriteTime = LastWriteTime;
    }
 
    return filename;
}


In addition, I’ve created several Store() method overloads that allow a parent ID and a reference ID to be stored allowing grouping and linking of stored items respectively.

The solution provided here is 100% working, includes several unit tests and all the C# source code which is fairly well commented. Click on this link to download the zipped solution files. [Standard disclaimer applies: use at own risk, no liability or responsibilty taken for any loss or damage, what you do with the downloaded files, you do so at your own risk etc.]

If you have questions, comments, suggestions or bug fixes please leave a comment. Hope you find this useful.

Points to note:
  • You may want to change the database table names. In which case you’ll need to alter the T-SQL scripts and code.
  • There are a few places in the SQL scripts where I’ve defined and commented out alternatives between SQL Server 2000 and 2005. For instance: On SQL Server 2000, the column type used to store the binary data is defined as ‘image’, whereas on SQL Server 2005 the preferred type is ‘varbinary(max)’.
  • In SQL Server 2005 you can make use of the OUTPUT keyword instead of a separate T-SQL call to SCOPE_IDENTITY() to obtain the ID of the most recently added row.
  • The project’s T-SQL scripts do not contain functionality for setting up the tables on a separate filegroup. Talk to your DBA…
  • If you don’t need the GUID functionality, you can delete all code relating to this column and remove the column from the T-SQL scripts.
  • There are several unit tests included with the solution. To run the NUnit tests from within Visual Studio 2005, you will need to download and install TestDriven.NET
  • The code as it stands has a maximum limit of 2^31 -1 bytes for file sizes; if this isn’t enough then you should ask yourself whether the database is the best place to store those files!
  • All exceptions are bubbled up to caller. Retries of saving to DB are left up to caller.
  • Strongly-types datasets have NOT been used, and these are the prefered method when dealing with more complex data.
If you are interested, here is a list of possible enhancements.

Suggestion for Further Enhancements:
  • Version 3.0 (or is it 3.5?) of the .NET framework will have the ability to create compressed data that is directly compatible with being opened with standard Zip applications. This would mean having the option to restore a file from the repository in a compressed (zipped) state.
  • Create a method to return a response stream for ASP.NET file streaming (including ContentType such as “image/gif”).
  • Minimise memory usage (use buffering). Less explicit allocation of large byte[] arrays and taking advantage of ReadByte() and WriteByte() methods on streams.
  • Having the option to use the DeflateStream in preference to GzipStream.


    

Powered by Blogger