The Art of SQL Server FILESTREAM - Red Gate Software

DBA Handbooks
The Art of SQL Server FILESTREAM
A Quick Start Guide for developers and administrators
Jacob Sebastian & Sven Aelterman
ISBN: 978-1-906434-88-5

The Art of SQL Server FILESTREAM
A Quick Start Guide for developers and administrators
By Jacob Sebastian and Sven Aelterman
First published by Simple Talk Publishing June 2012

Copyright June 2012
ISBN 978-1-906434-88-5

The right of Jacob Sebastian and Sven Aelterman to be identified as the authors of this work has been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored or introduced into a retrieval system, or transmitted, in any form, or by any means (electronic, mechanical, photocopying, recording or otherwise) without the prior written consent of the publisher. Any person who does any unauthorized act in relation to this publication may be liable to criminal prosecution and civil claims for damages.

This book is sold subject to the condition that it shall not, by way of trade or otherwise, be lent, re-sold, hired out, or otherwise circulated without the publisher's prior consent in any form other than that in which it is published and without a similar condition including this condition being imposed on the subsequent publisher.

Technical Review by Sven Aelterman
Cover Image by Andy Martin
Edited by Tony Davis and Heather Fielding
Typeset & Designed by Peter Woodhouse & Gower Associates

Table of Contents

Introduction

Chapter 1: Storing and Managing Unstructured Data
  Structured Data
  Unstructured Data
    CLOB data
    BLOB data
  Storing Unstructured Data
    Storing Large Objects in SQL Server
    Storing Large Objects in the file system
    Database or file system?
  Enter FILESTREAM
    A first look at FILESTREAM
    When to use FILESTREAM
  Summary

Chapter 2: Getting Started with FILESTREAM
  The FILESTREAM Data Container
  Understanding the FILESTREAM Filegroup
  Creating FILESTREAM-enabled Databases
    Creating a new FILESTREAM-enabled database
    Enabling an existing database for FILESTREAM
  Creating a Table with FILESTREAM Columns
    Using SSMS table designer to create a table with FILESTREAM columns
    Using T-SQL to create a table with FILESTREAM columns
    Using T-SQL to add FILESTREAM columns to an existing table
    Converting VARBINARY(MAX) columns to FILESTREAM and vice versa
    Tables and FILESTREAM filegroups
    FILESTREAM filegroup queries
  FILESTREAM Data Manipulation Using T-SQL
    Inserting a row with FILESTREAM data
    Updating and deleting FILESTREAM data
    FILESTREAM and triggers
  Advanced FILESTREAM DDL
    The FILESTREAM data container and partitioned tables
    Creating a database with multiple FILESTREAM filegroups
  Disabling FILESTREAM Storage on a Database
  Summary

Chapter 3: Accessing FILESTREAM Data from Client Applications
  What is Streaming?
  Understanding Streaming Access to FILESTREAM Data
    Step 1: Starting a transaction
    Step 2: Retrieving the logical path name and transaction context
    Step 3: Opening the FILESTREAM data file
    Step 4: Reading and writing FILESTREAM data
    Step 5: Closing the FILESTREAM data file
    Step 6: Closing the transaction (COMMIT/ROLLBACK)
  Data Manipulation Using the Streaming APIs
    Inserting a new record into a FILESTREAM-enabled table
    Replacing the FILESTREAM data completely
    Partial updates to FILESTREAM data
    Reading information from the FILESTREAM store
    Deleting the BLOB data stored in a FILESTREAM column
    Deleting a row from a FILESTREAM-enabled table
  Lab 1: Reads, Writes and Partial Updates
    Handling multiple FILESTREAM columns and rows
  Understanding the Logical Path to a FILESTREAM Data File
    Formatting PathName()
    PathName() and ROWGUIDCOL
  SqlFileStream Class Reference
    Instantiating a SqlFileStream object
  OpenSqlFilestream API Reference
  Summary

Chapter 4: FILESTREAM with Entity Framework and LINQ to SQL
  Lab 2: FILESTREAM and Entity Framework
    Rebuilding the sample database
    Lab 2a: A simple EF application using T-SQL access
    Lab 2b: Using SqlFileStream and Entity Framework together
  Lab 3: FILESTREAM and LINQ to SQL
    Creating a new console application project
    Adding a LINQ to SQL class
    Defining object relational mappings
    Adding a new row
    Querying FILESTREAM data using LINQ to SQL
  Summary

Chapter 5: FILESTREAM with ASP.NET and Silverlight
  Lab 4: Uploading Files to a FILESTREAM Database from an ASP.NET Web Page
    Creating a new ASP.NET web application
    Adding a database connection string
    Designing the data entry page
    Writing the "code-behind"
    Running the application
  Lab 5: Deploying and Configuring a FILESTREAM Web Application on IIS
    Running Visual Studio under an administrator account
    Publishing the web application into a new virtual directory
    Configuring IIS to use SQL Server Integrated Security
    Running the application
  Lab 6: Creating a Web Handler to Serve Images from a FILESTREAM Database
    Creating a stored procedure to retrieve FILESTREAM data
    Creating the web page to serve images
    Running and testing the application
  Lab 7: Displaying Thumbnail Images of Items from a FILESTREAM Database on an ASP.NET Web Page
    Creating the stored procedure to fetch item information
    Adding a new web form and data grid
    Adding a data source
  Lab 8: Using SqlFileStream in an N-Tier Scenario
    Creating the Visual Studio solution and projects
    Writing the domain object code
    Writing the data access code
    Creating an HTTP handler to serve images
    Adding remote calling capability
    Testing the application from Visual Studio
  Lab 9: Playing a Video Stored in a FILESTREAM Database on a Silverlight Application
    Setting up the data
    Creating a Silverlight application
    Creating a video handler
    Adding and configuring MediaElement
    Running and testing the application
  Summary

Chapter 6: FILESTREAM with SSIS and SSRS
  FILESTREAM and SSIS
  Lab 10: Loading BLOB Values into a FILESTREAM Column Using SSIS
    Setting up the sample database
    Creating the SSIS package
  Lab 11: Exporting BLOB Values from a FILESTREAM Column Using SSIS
    Creating the SSIS package
    Setting the output folder
    The Execute SQL task
    The Foreach Loop container
    The Script task
  Lab 12: Displaying FILESTREAM Data in SSRS Reports
  Summary

Chapter 7: FILESTREAM Database Administration
  FILESTREAM and Database Transaction Isolation Levels
    READ COMMITTED, REPEATABLE READ and SERIALIZABLE isolation levels
    READ UNCOMMITTED isolation level
    Snapshot isolation level
    Summary of FILESTREAM behavior under transaction isolation levels
  Detaching and Attaching FILESTREAM Databases
  FILESTREAM and Garbage Files
    Garbage files and recovery models
    FILESTREAM garbage collection and tombstone tables
  FILESTREAM Data Corruption and DBCC Checks
    Corruption caused by missing FILESTREAM data files
    Corruption caused by orphaned files
    Corruption caused by deleting the garbage files manually
  Querying FILESTREAM Databases
  FILESTREAM Data and Space Management
    Changing the FILESTREAM filegroup location
    Adding a new partition to share the FILESTREAM storage load
  Migrating FILESTREAM Data
    SSMS scripting
    The Database Publishing Wizard
  Summary

Chapter 8: Backup and Restore for FILESTREAM Databases
  Creating and Populating the Sample Database
  Backing up FILESTREAM Databases
    Full backups
    Differential backups
    Transaction log backups
  Restoring FILESTREAM Databases
    Restoring from a full backup
    Point-in-time restore
  File Backups and Piecemeal Restores
  Data and Backup Compression for FILESTREAM Databases
  Beware of Garbage Files in FILESTREAM Backups
  Summary

Chapter 9: Investigating FILESTREAM Databases
  Relevant System Catalog Views
    sys.all_columns, sys.columns and sys.system_columns
    sys.computed_columns
    sys.identity_columns
    sys.dm_repl_traninfo
    sys.tables
    sys.internal_tables
    sys.partitions
    sys.data_spaces
    sys.database_files
    sys.filegroups
    sys.system_internals_partition_columns
    sys.system_internals_partitions
    sys.filestream_tombstone_*
  Useful FILESTREAM Metadata Queries
    SQL Server instance-level queries
    Database-level queries
  FILESTREAM-related DMVs
    sys.dm_filestream_file_io_handles
    sys.dm_filestream_file_io_requests
    sys.dm_tran_active_transactions
  FILESTREAM and Wait Types
    FS_FC_RWLOCK
    FS_GARBAGE_COLLECTOR_SHUTDOWN
    FS_HEADER_RWLOCK
    FS_LOGTRUNC_RWLOCK
    FSA_FORCE_OWN_XACT
    FSAGENT
    FSTR_CONFIG_MUTEX
    FSTR_CONFIG_RWLOCK
  Summary

Chapter 10: Integrating FILESTREAM with other SQL Server Features
  FILESTREAM and Replication
    Replicating FILESTREAM columns
    Maximum replication data size
    FILESTREAM replication and the UNIQUEIDENTIFIER column
    Managing the FILESTREAM replication flag using T-SQL
    Replication log reader and the FILESTREAM garbage collector
    FILESTREAM and synchronization methods
    Replicating databases with multiple FILESTREAM filegroups
    FILESTREAM and the different replication types
  FILESTREAM and Log Shipping
  FILESTREAM and Full-Text Indexing
  FILESTREAM and Database Snapshots
  FILESTREAM and Change Data Capture
  FILESTREAM and Data Compression
  FILESTREAM and Transparent Data Encryption
  FILESTREAM and Database Mirroring
  FILESTREAM and High Availability Disaster Recovery
  Summary

Chapter 11: FileTable
  Introduction to FileTable
  FileTable Concepts
    Fixed schema
    Creating folder structures using the HIERARCHYID data type
    Root folder
    Non-transactional access
    FileTable namespace
    FileTable security
  Setting Up the Server for FileTable
  Creating a Database with Support for FileTable
  Creating a FileTable
    Create a FileTable using T-SQL
    Create a FileTable using SSMS
    Work with a FileTable using SSMS
    Adding constraints to the FileTable
    Restrictions on creating FileTables
  Accessing a FileTable with Windows Explorer and Other Client Applications
    Creating files
    Creating folders
    Memory-mapped files
    Empty database folder
  Programming FileTable
    Adding rows using T-SQL
    Using .NET file I/O APIs
  Managing FileTables
    T-SQL functions
    Catalog and Dynamic Management Views
    Stored procedures
  Advanced FileTable Uses
    Full-text search
    Semantic search
  Investigating FileTable
  Summary

Chapter 12: Planning, Configuration and Best Practices
  Planning
  Optimizing your Storage Configuration for FILESTREAM
    Keep FILESTREAM data containers on a separate disk volume
    Disabling short file names
    Compressing FILESTREAM data
    Configuring NTFS cluster size
    Disabling the indexing service
    Configuring antivirus
    Disabling the Last Access Time attribute
    Configuring the disks with the correct RAID level
    Regular disk defragmentation
  Setting up FILESTREAM for Remote Access
    Configuring the client computer for remote access
    Configuring the client application for remote access
  Best Practices for FILESTREAM Development
    Use FILESTREAM streaming APIs to read and write FILESTREAM data
    Avoid small partial updates
    Keep an additional column to store the type of file or extension
    Identifying the type of file from a file header
    Add a content-type column on FILESTREAM tables
    Add a timestamp or date column to track last modified date for cache control
    Use a computed column to retrieve the file size
    Keep a default constraint on FILESTREAM columns
  Summary

Appendix A: Configuring FILESTREAM on a SQL Server Instance
  Understanding FILESTREAM Configuration and Access Levels
    FILESTREAM access levels
  Enabling and Configuring FILESTREAM During Installation
    Enable FILESTREAM for Transact-SQL access
    Enable FILESTREAM for file I/O streaming access
    Allow remote clients to have streaming access to FILESTREAM data
  Enabling and Configuring FILESTREAM After Installation
    Configuring FILESTREAM at the Windows level
    Configuring FILESTREAM at the SQL Server instance level
  Verifying FILESTREAM Configuration
    Verifying Windows-level FILESTREAM configuration settings
    Verifying SQL Server Instance-level FILESTREAM configuration settings
    Using SSMS to check that FILESTREAM is enabled
    Using T-SQL to check that FILESTREAM is enabled
  Disabling the FILESTREAM Feature
  Advanced Installation and Configuration Options
    Unattended SQL Server installation and FILESTREAM configuration
    Enabling and configuring FILESTREAM from the command line
    Configuring FILESTREAM access level from the command line
    Setting up FILESTREAM on a failover cluster
  Summary

About the authors

Jacob Sebastian

Jacob Sebastian is a Microsoft MVP (SQL Server) from Ahmedabad, India, who has been working with SQL Server for almost 15 years. He works as CTO with Excellence Infonet, Ahmedabad, which focuses on building highly scalable, mission critical applications for the healthcare industry.

Jacob started his database career in the early nineties with Dbase and then moved to Clipper, FoxPro and, finally, to SQL Server, starting with version 6.5. He has presented at various SQL Server conferences, including PASS Global Summit, European PASS Conference, 24 Hour PASS, Tech-ED India, Community Techdays, Microsoft Virtual Techdays, SQL Server Saturday and more.

Jacob wrote his first book in 2009, The Art of XSD – SQL Server XML Schema Collections (http://brurl.com/fs21), which delved deeply into the XSD and XML Schema collections, introduced in SQL Server 2005. He also wrote the XML chapter for the book, SQL Server 2008 Bible (http://brurl.com/fs22).

When not working, Jacob spends time at beyondrelational.com.

Jacob was the lead author on this book, and wrote Chapters 1 to 10.

Sven Aelterman

Sven Aelterman is a lecturer in Information Systems at Troy University in Troy, Alabama. He teaches undergraduate courses in data warehousing, network infrastructure and security. He is also the technology specialist for Troy University's Sorrell College of Business, and performs a variety of technology roles with a global scope.

He continues consulting work through Adduxis, where he assists customers with software development projects using the Microsoft .NET Framework and related technologies. Sven focuses on application design and architecture and secure coding practices. He received his MBA from Troy University and his bachelor's degree from the Hogeschool Gent (University College, Ghent) in Belgium, his native country. Sven is married to Ebony and they have a son, Edward, and a daughter, Sofia.

Sven contributed Chapter 11 to this book, and the labs in Chapters 4 and 5, as well as some additional material throughout.

Acknowledgements

Jacob Sebastian

This is my third book and, as always, the first name I would like to mention is Steve Jones at SQLServerCentral.com, who deserves the credit for igniting my passion for writing. Thereafter, I've benefited from the help of numerous mentors. Big thanks, in particular, to Jeff Moden, Madhu Nair, Kent Waldrope, Arnie Rowland, and Michael Coles, all of whom played a significant role in shaping my attitude towards the SQL Server community.

I would like to thank Tony Davis at Red Gate, who helped me to get this project started and guided me throughout this long journey. Tony not only managed the project, but also stepped in at every stage throughout the writing process, commenting, correcting, rephrasing and reorganizing to mold the content into the shape in which you see it in this book.

Sven Aelterman, technical reviewer and co-author of this book, deserves major credit since this book would not have been possible without his involvement and hard work. He wrote the entire chapter on FileTable and rewrote several sections of other chapters, including many of the code samples. His expertise in SQL Server, FILESTREAM and various Web/.NET technologies has played a crucial role in ensuring the accuracy of information presented in the book.

Several other people contributed, directly or indirectly, to this book throughout the journey. Michael Rys and Srini Acharya at Microsoft were kind enough to answer many of my various questions, as was Paul Randal at sqlskills.com. My friend, Sudeep Raj, did a wonderful internal review of the SSIS lab and provided some valuable comments. Many times during this journey, I needed to recharge and refocus on the goal of completing the book. For helping me achieve this, I thank my friends Anil, Khyati, Pinal, Krupali, Tejas, Nakul, Vinod Kumar, and everyone else who helped motivate me along the way.

Finding time for writing such a lengthy book has been a real challenge. I ended up stealing the weekends, holidays, early mornings and late night hours from the family. I dedicate this book to my wife Jincy and daughters Julie and Jiya. Without your support, encouragement and understanding, I would not have been able to write it.

Finally, I would like to thank all the readers of my articles, blogs, and previous books, for their continuous feedback and encouragement.

Sven Aelterman

Participating in the technical review and authoring of this book has been a unique experience and one that would not have been possible without the efforts of many others. I am, first and foremost, grateful to Jacob for allowing me to share the authorship of this work with him. Jacob is, without a doubt, a leading expert in SQL Server and FILESTREAM. Working with him has allowed me to learn so much, including the fact that I didn't yet know everything about FILESTREAM. A close second is reserved for the efforts of our editor, Tony Davis. It was probably sufficient to ask him to coordinate Jacob's writing and my reviews, but his role didn't stop there. Adding an additional chapter and more content to the book doubtlessly caused him long nights. I appreciate his experience and guidance in seeing us through this venture.

As you're about to learn, SQL Server FILESTREAM is an exciting and intricate feature. I would like to claim that I was able to figure it all out on my own, but I would do injustice to several people who helped clarify some of the finer points (where the documentation wasn't always helpful).

First, Microsoft developer evangelist, Glen Gordon, was instrumental in connecting me to members of the SQL Server team, whom I'd like to thank for reading and responding to my wordy questions. Finally, Paul Randal at SQLskills was also called upon for some verification.

While the help of these professionals was vital to the accuracy and completeness of the book, I am equally thankful to my wife, Ebony, for her support during the late nights of researching, editing, and writing. She was very much aware of the existence of our two children. I am not so sure if she was aware of the existence of her husband anymore. (Honey, he was the one sleeping in late while you were raising the kids.)

Introduction

When I first started my database career in the early 1990s, all the applications on which I worked dealt with simple relational data, stored in a few relational tables. The most commonly used data types were strings, numbers, dates, and Boolean. Many of these were single-user applications that lived on a single computer.

Soon, however, applications rapidly started becoming more complex, and new and more challenging data processing requirements began to emerge. Many applications started dealing with unstructured data. Along with relational data, our applications needed to be able to generate, process and serve non-relational data such as XML, images, music, videos, and so on.

SQL Server 6.5 offered two data types for storing large chunks of text and binary data, namely text and image respectively. SQL Server 7.0 added one more data type, ntext, which enabled storage of Unicode text. SQL Server 2005 added the data types VARCHAR(MAX), NVARCHAR(MAX) and VARBINARY(MAX), which made reading and writing chunks of unstructured data much simpler. Despite these improvements, the basic underlying problem remained: SQL Server is primarily a relational database management system and is optimized for processing relational data. Reading and writing large chunks of text/binary data within the database requires a lot of SQL Server internal memory (buffers as big as the size of the chunk being read/written), and multiple, simultaneous requests for large blocks of data can put the server under severe memory pressure.

The workaround adopted by most database architects was to store the unstructured data as disk files on the file system, and store only the path to the file in the relational table. The file system is optimized for reading and writing large files, and offers streaming capabilities. This solved the performance problems, but introduced new challenges. In this scheme, we had no way to ensure transactional consistency between relational data and data in the file system. Also, management of backups was complicated, since the normal database backups were only useful if accompanied by the corresponding backup of the file system.

When SQL Server 2008 introduced the FILESTREAM feature, it had my immediate attention. At last, we had a way to store the unstructured data in the file system and thus take advantage of the NTFS streaming capabilities but, at the same time, ensure transactional consistency between the relational data and the file system data. In addition, a normal database backup would now also include the file system data.

It's safe to say I was an early adopter, and my primary goal with this book is to ease the paths of others who are interested in this technology, as some aspects of its adoption proved far from plain sailing.

This book attempts to guide you, step-by-step, through every phase of FILESTREAM implementation, from enabling the feature, to creating FILESTREAM tables, to manipulating FILESTREAM data through the streaming APIs, to handling FILESTREAM data in simple ASP.NET web pages, to complex N-tier .NET applications.

It also covers, in detail, administration and troubleshooting of FILESTREAM databases and tables. Alongside the advantages offered by the technology, it tries to cover some of its shortcomings, and to offer practical advice on workarounds, or possible alternative solutions.

Who is this book for?

The primary audience for this book is SQL Server developers who have to deal with more than relational data. It will help them to get started with FILESTREAM-enabled databases, and it provides very detailed coverage of accessing FILESTREAM data through the .NET and Win32 APIs. We have provided a reasonable number of examples, sample code, and labs that demonstrate how to access FILESTREAM data from C#.NET, VB.NET and C++. In addition, there are several labs demonstrating FILESTREAM access from common application platforms/tools such as ASP.NET, Silverlight, SSIS, SSRS (SQL Server Reporting Services), Entity Framework, LINQ to SQL and so on.

The secondary audience for this book is SQL Server administrators and technical project managers on projects requiring the storage and retrieval of BLOB (Binary Large Object) data. The second half of the book is dedicated to FILESTREAM database administration-related topics, covering almost everything you might need to know when dealing with FILESTREAM-enabled databases.

How this book is organized

Chapters 1–2 of the book provide an overview of the FILESTREAM feature and enough information to enable the technology and then start creating a FILESTREAM-enabled database, tables and columns, and manipulating FILESTREAM data with T-SQL.

Chapters 3–6 explain how to access FILESTREAM data from client applications, through the streaming APIs, targeting the interests of database developers. Topics covered include how to:
• access FILESTREAM data in a transactionally consistent manner using C#.NET (with equivalent code examples provided in C++ and VB.NET)
• access FILESTREAM data with Entity Framework and LINQ to SQL
• deploy and configure FILESTREAM Web Applications with IIS (Internet Information Services)
• handle FILESTREAM data in N-tier .NET applications.

Chapters 7–10 cover topics relating to the administration of FILESTREAM-enabled databases, such as backup and restore, migrating FILESTREAM databases and data, and investigating FILESTREAM databases, using Dynamic Management Views (DMVs).

Chapter 11 takes an in-depth look at an extension of FILESTREAM, called FileTable, which was introduced in SQL Server 2012. A FileTable can store the data of an entire NTFS folder hierarchy. Unlike a regular FILESTREAM column, client applications can directly access the files in a FileTable.

Chapter 12 provides a summary of some of the best practices to observe when using the FILESTREAM feature.

Finally, there is an appendix providing advice on how to set up, review and troubleshoot FILESTREAM configuration-related problems.

Code examples

Throughout this book are scripts demonstrating use of the FILESTREAM feature. All the code you need to try out the examples in this book can be obtained from the URL below.

http://www.simple-talk.com/RedGateBooks/SebastianAelterman_ArtOfSQL-ServerFILESTREAM_Code.zip.

All examples should run on all versions and editions of SQL Server from SQL Server 2008 upwards, unless specified otherwise. You must have .NET Framework 3.5 SP 1 or later to run the .NET examples. Some of the .NET examples use the .NET 4.0 method Stream.CopyTo(), and you will need either .NET 4.0 for those examples, or to replace this method with equivalent code in .NET 3.5 (as explained with the relevant code listings). To run the labs, you will need the associated environment, such as Silverlight SDK (Software Development Kit) for the Silverlight Lab, SQL Server Reporting Services for the SSRS Lab, SQL Server Integration Services for the SSIS Lab, and so on. Chapter 11, on FileTable, requires SQL Server 2012.

Chapter 1: Storing and Managing Unstructured Data

One of the challenges of designing data storage for SQL Server-driven applications is that not all data fits into the relational model. The relational model expects each table to describe a discrete entity and each column in the table to define exactly one aspect of that entity. Much business data is less structured in nature; it is becoming increasingly common to need to store images, documents, and speech data (PDF, MP3, and so on) alongside the relational data.

It's been possible to store this sort of unstructured data in SQL Server for a number of years, but there have always been associated difficulties. With older, large object (LOB) data types, such as TEXT, NTEXT, and IMAGE, the data was generally held out of row, i.e. on a different page from the rest of the row data. Reading and modifying the data, using T-SQL functions such as READTEXT and WRITETEXT, was painfully slow. The introduction of the large-value (MAX) data types, namely VARCHAR(MAX), NVARCHAR(MAX) and VARBINARY(MAX), improved matters somewhat and made it easier to develop applications that used these LOB data types. However, it was still preferable to store these LOBs on the file system, with only a link to the data stored in the database, in order to avoid the buffer pool pollution that arose from having such data stored inline, and also to take advantage of high-performance streaming, lower rates of fragmentation during modifications, and so on, that are offered by file system storage.

Nevertheless, storing LOBs on the file system, separately from the rest of the business data, was not without problems, including the fact that the data wasn't included in database backups and, perhaps most significantly, having the data outside the transactional control of the database meant that it was possible for the LOB data, if it were subject to modification, to become out of sync with the structured, SQL Server-based data to which it related.
In addition, when LOB data is stored in the file system, outside the control of the database administrator, any user or application with sufficient Windows permissions can access the LOB data files directly. This makes it a considerable challenge to ensure that the LOB data files are accessible only to the Windows users who have access to the associated relational data.

Enter FILESTREAM storage, introduced with SQL Server 2008, which allows us to append a FILESTREAM attribute to a VARBINARY(MAX) data type, and so get the performance advantages of storing LOB data on the file system, combined with the many advantages of having that data under the control of SQL Server. In this chapter, we will begin our exploration of the storage of LOB data, using the new FILESTREAM attribute, covering:
• what we mean by structured, unstructured and semi-structured data
• advantages and disadvantages of storing LOB data in the database and file system
• what FILESTREAM offers, a first look at how it works, and strategic considerations for its use.

Structured Data

As database professionals, we deal with data every day; we store, manage, process, and query large volumes of data as part of our day-to-day database development and management work. Most of the data that we handle may be regarded as structured, meaning that it conforms fully to the formats and data models associated with the relational database. By way of its compliance with the relational data model, such data generally also conforms to more formal definitions of the term, structured data, which refer to:
• information that has been organized to allow identification and separation of the context of the information from its content
• data organized into semantic chunks or entities, with similar entities grouped together in relations or classes, and presented in a patterned manner

• data that can be shared electronically with customers and suppliers, because the structure and meaning of data has been standardized and usually determined by a data model.

All of these definitions point to the defining characteristic of structured data: the ability to identify and extract specific pieces of information that are included in the data. This is usually done by adding some sort of schema or data model which explains the structure of the data and makes it meaningful. It means we can query the data based on predetermined data types and well-defined and understood relationships.

Of course, there are degrees of structure; data that one application would deem unstructured, might be deemed structured enough by another. To demonstrate what I mean, consider the data presented in Listing 1-1, which is a log generated by an invoicing application.

    Row Details
    --- -----------------------------------------------------------
    01  Mike created invoice 101 dated 2010/01/13 with a note
        to notify James before shipping on fax number 456-098-0909
    02  John created invoice 102 dated 2010/01/14 with a note to
        call Anna on 898-090-0909 after shipment is sent

Listing 1-1: Unstructured log data.

If asked whether this data is structured or unstructured, most would say "unstructured," and with good justification: many pieces of vital information are stored in each row as a single long text string. It is comprehensible to a human, but it will be hard for software to extract those pieces of information from the text string.

So essentially, yes, this data is unstructured but even so, if this data is used just as a log of activities and the application does not need to query those vital pieces of information embedded in the text string, and extract them separately, then for that business this data is structured enough. Of course, a second application might need direct access to the invoice date, currently embedded within the string stored in the Details column. From this application's perspective, the data really is unstructured, and we'd need to store the information in a slightly more structured manner, as shown in Listing 1-2.

    Inv# SalesRep InvDate    InvNote
    ---- -------- ---------- -------------------------------------------------------
    101  Mike     2010/01/13 Notify James before shipping on fax number 456-098-0909
    102  John     2010/01/14 Call Anna on 898-090-0909 after shipment is sent

Listing 1-2: Slightly more structured log data.

A third application, one that requires access to all of the individual elements in the data, would require the data to be more structured still, in fact in the sort of form we are accustomed to seeing in typical relational database tables, as shown in Listing 1-3.

    Inv# SalesRep InvDate    Notify Number       type   when
    ---- -------- ---------- ------ ------------ ------ ---------------
    101  Mike     2010/01/13 James  456-098-0909 Fax    before shipment
    102  John     2010/01/14 Anna   898-090-0909 Phone  after shipment

Listing 1-3: Structured log data.

I recently encountered a perfect example of this subtlety in a real application. It was a data integration project for a health care system, where the source system stored the patient address in a single field, and so the address column in the source system held values such as "999-888, Kings Highway, Brooklyn, NY 11223." From the perspective of this system, the data was structured enough, since the address information was used only for mailing purposes. A mailing module picked the information from the address field and printed the mailing label. No other part of the application touched the address column.

The target system, however, was more advanced and needed to display the location of patients on a map, and so needed to extract individual elements of the address, such as street name, zip code, and so on, in order to calculate the latitude and longitude of each patient. Because the target system found the data to be unstructured, we had to build a module that parsed the source data and produced more structured output for the target application.

From here on, however, we are simply going to define "structured data" in general to mean data that meets the requirements of the relational model, such that we can extract the required information from the data easily and accurately.

Semi-structured data

Semi-structured data is a form of structured data that does not fully follow the rules and structures associated with tables and data models of relational databases. However, it does include tags and markup to separate and identify semantic elements and collections of records and fields within the data. The most common example of semi-structured data is XML. An XML document contains hierarchical data organized as elements and attributes, which does not fall within the relational data model. However, a database engine such as SQL Server can read the markup information and extract values out of an XML document, and shred the data into one or more result sets. We won't cover semi-structured data any further in this book.

Unstructured Data

From the previous section, it is clear that data can be considered to be in an unstructured form when it is hard to programmatically retrieve the information at the desired level of granularity. Such data is often not associated with a schema, or has only a very simple schema, and typical examples include images and videos. Following on from our definition of structured data, we can simply state that unstructured data is any data that does not meet the requirements of the relational model.

Due to the complex nature of today's business requirements, it is quite natural that applications need to handle unstructured, as well as structured, data. For example, not all the input data is generated by manual data entry nowadays. The input data may come from a variety of sources and, as discussed, data may be structured enough for the application that produces the data, but may be unstructured in terms of the requirements of the application that consumes it.

In the language of the relational database, this sort of unstructured data is a Large Object (LOB), with Character Large Objects (CLOBs) for storing plain text data, and Binary Large Objects (BLOBs) for binary data.

CLOB data

Many applications deal with some sort of CLOB data. For example:
• an email management application that stores the Base64 encoded version of the attachments
• a discussion forum that stores the questions and answers discussed in the forum threads
• a content management system (CMS) that stores the content of web pages (e.g. articles) in a relational table
• an invoicing application that stores the notes or delivery instructions.

This sort of data does have a defined structure; for example, the input controls for a forum application, which stores the questions and answers, generate data in a certain markup format (e.g. HTML or RTF). When the data is to be displayed, the application reads the data, analyzes the structure, and can recognize certain elements of the data that require specific formatting, such as URLs.

However, in the context of our definition of structured data, this CLOB data is unstructured. It is not easy for the relational database engine to understand the content stored in the CLOB column. For example, if a user wants to retrieve a list of all URLs in a set of forum posts, this is not going to be easy for a relational database query.

BLOB data

BLOBs consist of "chunks" of binary data. When you record a voice note using your personal organizer software, you are generating some BLOB data. A chunk of encrypted binary data, or the content of an executable file, is a BLOB value. More and more applications today need to handle and store BLOB data. A few examples include:
• a document management system that stores different types of documents, such as Word or Excel
• a product catalog application that stores the images of products
• a multimedia library that stores videos and music
• a software download website that stores setup files, executable or zip files.

Again, most BLOB data – Word or Excel documents, images, audio files, and so on – does have a defined structure; an application that generates or consumes these files can understand the content based on the associated, simple schema.

However, from the relational database perspective, BLOB data is most certainly unstructured. Even though the applications that read and write BLOB data from and into the database understand the structure of the data, a database engine does not really understand it.
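The earlier point about CLOB content can be made concrete with a small sketch. The table variable @ForumPosts and its sample rows are hypothetical, invented purely for illustration. The query can tell us which posts contain something that looks like a URL, and where the first one starts, but producing a clean list of every URL would need string-parsing logic well beyond a simple relational predicate.

```sql
-- Hypothetical sample data; not taken from the book's code download
DECLARE @ForumPosts TABLE
(
    PostID INT NOT NULL PRIMARY KEY,
    Body   VARCHAR(MAX) NOT NULL
);

INSERT INTO @ForumPosts (PostID, Body)
VALUES (1, 'See http://example.com/a and also http://example.com/b for details.'),
       (2, 'No links in this post.');

-- The engine can only pattern-match on the text; it has no notion of "a URL"
SELECT PostID,
       PATINDEX('%http://%', Body) AS FirstUrlStart
FROM   @ForumPosts
WHERE  Body LIKE '%http://%';
```

Only post 1 is returned, and the second URL in that post is not reported at all, which is the kind of gap that makes such data "unstructured" from the engine's point of view.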

Storing Unstructured Data

As discussed, most business applications need to store, manage and process unstructured data of various shapes and sizes. Choosing the right storage design for this data will be critical to the performance, manageability, and overall health of the application and its database. There are two basic choices: store the LOB in the database, or store it on the file system with a pointer to it from within the database. So, depending upon the nature of the data, usage pattern, workload and various other factors related to the specific application and business requirements, the designers might decide whether to keep the LOB data within the database or outside the database.

Storing Large Objects in SQL Server

Most relational database management systems provide data types that can store LOBs. One can simply create a table with columns of the appropriate data types and use the programming model provided to read, write, and update the LOB data.

Pre-SQL Server 2005, unstructured data was stored in LOB data types. CLOB data was stored in TEXT (and NTEXT for Unicode) data types. BLOB data was stored in the IMAGE data type. These data types could store up to 2 GB of data, which was stored separately from the relational table data.

These old LOB data types, while still available, are deprecated in SQL Server 2005 and later, in favor of the large-value data types: VARCHAR(MAX) for text data (and NVARCHAR(MAX) for Unicode character data) and VARBINARY(MAX) for BLOB data. Working with these new data types is much easier than working with their older counterparts. For example, reading from, or writing into, a VARCHAR(MAX) column is much easier than with a TEXT column.

    -- Create a table and insert a row
    CREATE TABLE BlobTable
    (
        TextValue TEXT ,
        VarcharMaxValue VARCHAR(MAX)
    )
    INSERT INTO BlobTable
        ( TextValue ,
          VarcharMaxValue )
    SELECT 'The text data type is invalid for local variables.' ,
           'However VARCHAR(MAX) is valid'

    -- Reading the first 20 characters from the TEXT column
    DECLARE @val VARBINARY(16)
    SELECT @val = TEXTPTR(TextValue)
    FROM BlobTable
    READTEXT BlobTable.TextValue @val 0 20

    -- Reading the first 20 characters from the VARCHAR(MAX) column
    SELECT LEFT(VarcharMaxValue, 20)
    FROM BlobTable

Listing 1-4: Working with the new LOB data types is much easier.

All of the new LOB data types introduced in SQL Server 2005 support storage of up to 2 GB of data in a single column.

    VARBINARY(10)   -- Can store up to 10 Bytes
    VARBINARY(8000) -- Can store up to 8000 Bytes
    VARBINARY(8001) -- Error!!!
    VARBINARY(MAX)  -- Can store up to 2 GB of data

Listing 1-5: MAX identifier raises the storage capacity from 8,000 bytes to 2 GB.
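As a further illustration of how the MAX types behave beyond the 8,000-byte boundary, here is a minimal sketch. The temporary table #MaxDemo and its column are hypothetical names chosen for this example; the sketch assumes SQL Server 2005 or later, where the .WRITE clause of UPDATE is available for large-value columns.

```sql
-- Hypothetical scratch table, for illustration only
CREATE TABLE #MaxDemo (Notes VARCHAR(MAX) NOT NULL);

-- Without the CAST to VARCHAR(MAX), REPLICATE truncates its result at 8,000 bytes
INSERT INTO #MaxDemo (Notes)
SELECT REPLICATE(CAST('a' AS VARCHAR(MAX)), 10000);

-- Confirm that more than 8,000 bytes were stored
SELECT DATALENGTH(Notes) AS BytesStored
FROM   #MaxDemo;

-- Partial update: replace the first three characters (the offset is zero-based)
UPDATE #MaxDemo
SET    Notes.WRITE('XYZ', 0, 3);

SELECT LEFT(Notes, 5) AS FirstFive
FROM   #MaxDemo;

DROP TABLE #MaxDemo;
```

The .WRITE clause is one reason the large-value types are more convenient than TEXT: a portion of the value can be modified with an ordinary UPDATE, with no text pointer involved.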

Storing Large Objects in the file system

Many applications that deal with LOB data store this data outside the database, in the file system. Under this approach, a Products table in SQL Server would simply have an additional VARCHAR column (called, for example, ImagePath), which stores the file system path to the image file associated with the current product, the image being stored as an individual disk file in a designated folder.

An application that needs to retrieve the image will first read the relational data, obtain the path to the image file, and grab the image file using file system APIs. When a new product is added, the image file is copied to the specified folder and the path to the new image file is inserted into the ImagePath column. When an image needs to be updated, either the existing image file is overwritten, or a new file is copied and the path is updated in the table.

Database or file system?

Architects of applications that work with LOBs need to decide whether to store this data in a column within the database or as files in the file system. To a large degree, this decision will depend on the exact nature of the data, and the business application that uses it. Fundamentally, the size of a single LOB in SQL Server 2000 and 2005 is limited to 2 GB. If your application needs to store, for example, high-definition movies or X-ray films, then you may well exceed this limit, and have little choice but to store them on the file system, where the limit, for an NTFS file system, is closer to 16 TB (unless you can split the data into multiple 2 GB columns which, in many cases, would be impossible, or very painful).

Another common consideration would be the coupling of the data to the application and to other data. For an application like Flickr, the database is actually just a repository for loosely-coupled BLOBs. It probably makes more sense to store them on the file system.
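The path-in-a-column approach described above might look something like the following sketch. Only the Products table and the ImagePath column are named in the text; the other columns, the data types and sizes, and the sample path are assumptions made for illustration.

```sql
-- Sketch of the "pointer in the database, file on disk" design.
-- Column names other than ImagePath, and all sizes, are assumptions.
CREATE TABLE dbo.Products
(
    ProductID   INT           NOT NULL PRIMARY KEY,
    ProductName NVARCHAR(100) NOT NULL,
    ImagePath   VARCHAR(260)  NULL   -- file system path to the image file
);

-- Adding a product: the application copies the file to the folder first,
-- then records where it put it
INSERT INTO dbo.Products (ProductID, ProductName, ImagePath)
VALUES (1, N'Sample product', 'D:\ProductImages\1.jpg');

-- Retrieving the image: read the path, then open the file with file system APIs
SELECT ImagePath
FROM   dbo.Products
WHERE  ProductID = 1;

-- Replacing the image with a new file: only the pointer changes in the database
UPDATE dbo.Products
SET    ImagePath = 'D:\ProductImages\1_v2.jpg'
WHERE  ProductID = 1;
```

Note that nothing in this design ties the file copy and the row change together; the database has no knowledge of whether the file at the stored path actually exists.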

Conversely, if an application needs to store a PDF version of every quote sent to a customer, then there is tighter coupling and it may make more sense to store the LOBs in the database.

Regardless of the route that seems most appropriate for your application, there are advantages and disadvantages to each approach, based on considerations of:
• transactional consistency
• data manageability and recoverability
• security
• performance.

We'll briefly discuss these issues in the coming sections, and further details can be found in Paul Randal's white paper, FILESTREAM Storage in SQL Server 2008 (http://brurl.com/fs1).

Transactional consistency

One of the fundamental advantages of storing data in a relational database is that, via the implementation of various transaction isolation levels, locking mechanisms, and so on, the relational engine guarantees a transactionally-consistent view of the data, at any point in time. In other words, it guarantees that transactions within SQL Server conform to the ACID properties, and so prevents interference between transactions on the same data, and ensures that transactional operations can succeed or fail together, as a batch.

By storing LOBs within the database, we can ensure that this transactional consistency extends to the LOB data as well as other relational data within the database. For example, assume that a user uploads a document (a BLOB) into a document management application that uses SQL Server storage. The operation involves inserting a new row into a table along with the relational data (such as the document name) and the BLOB data (the content of the document). Operations on the BLOB column fall within the scope of any open transactions, and a COMMIT or ROLLBACK affects the changes to the BLOB column just like any other columns that are involved in the current batch of operations (COMMIT will save the changes and ROLLBACK will discard the changes).

Conversely, when LOB data is stored outside the database, SQL Server cannot ensure transactional consistency between the relational data in the tables and the data in the file system. It is hard, for example, to implement an architecture that rolls back file system changes when a SQL Server transaction rolls back. This can result in situations where the relational data and LOB data get out of sync. It is common, over time, to end up with "orphaned" files that are not referenced from the relational data and, conversely, references to files that no longer exist.

Data manageability and recoverability

From a data management perspective, there are several reasons why it is a lot easier to have all the data, including LOB data, stored in the database.

Firstly, it means that when a full backup of the database is taken it will include the LOBs, which makes restoring the data to a consistent state in the event of a disaster a much easier task. Also, it eases the process of moving the data from one location to another.

Since it is quite easy to take differential backups and transaction log backups of databases, we can (at least in theory) perform point-in-time restores of all data in the database, including the LOB data. For example, in cases of accidental data loss, the DBA can return the database to a previous state, at a point in time just before the operation took place that erroneously removed the data.

When storing the LOBs in the file system, and only the path to the disk file in a relational table, the backups do not contain the LOB data, and DBAs need to take separate backups of the database and file system.
When the database is to be restored to another location, the correct version of the file system backup should also be restored. This requires careful versioning to make sure the right file system data is restored with the right database backup. It is also important to make sure to keep the same file system structure in the new location so that the disk files maintain the same path as that stored in the relational tables.

When the LOB data is stored outside the database, the relational data can be restored to a given point in time, but not the file system changes, since the file system on its own does not track modifications and does not allow you to roll back the changes, once the file is saved.

Security

When LOB data is stored within the database, the security can be managed at the SQL Server level. This reduces the administrative complexity and aligns LOB security with the same security processes applied to the associated relational data.

If LOB data is stored outside the database, separate measures must be put in place to secure the file system. The relational data and LOB data are disconnected (or connected with just a weak link which stores the path to the file in a relational column), and managing user access permissions is a major challenge.

Performance

You're probably thinking that, on almost all levels, storing LOB data in the database is preferable to storing it in the file system. However, despite the numerous advantages, discussed so far, of storing LOB data in the database, performance is the one consideration that makes it an unfeasible option for most applications.

The relational database engine is optimized to process relational data; storing and processing huge chunks of LOB data is not always done with the same ease.

One of the big advantages that the file system can offer, when handling LOB data, is streaming. The streaming of data refers to transferring it in a manner that allows it to be processed as the data is transferred, rather than collecting all the data before the processing begins. Streaming significantly increases the performance of read and write operations on that data.

SQL Server cannot stream LOB data. When a chunk of LOB data is written, SQL Server creates a memory buffer large enough to hold the data, accepts all the data, and only then starts writing it into the database. Similarly, when LOB data is read, SQL Server creates a memory buffer large enough to hold the LOB value, reads all the data into the buffer and then sends it to the client connection that requested the data. So, to read or write a 500 MB LOB value from or to a column, SQL Server needs to allocate 500 MB of memory buffer to read or write the value.

Exceptionally large objects, or multiple simultaneous requests to read or write LOB data, can cause severe memory pressure and possibly result in the dreaded "insufficient memory" error. Keep in mind that SQL Server is optimized for I/O operations involving 8 KB pages, not massive objects. In any scenario, such queries can slow down other queries and affect overall server performance.

A piece of Microsoft research which studied the performance differences between storing LOB data in the database and in the file system found that the file system is the preferred option for storing LOB data if the size is 1 MB or above.

To BLOB or not to BLOB?

You can find the Microsoft Research Paper, one of Jim Gray's last projects before he disappeared at sea, To BLOB or Not To BLOB: Large Object Storage in a Database or a File System at http://brurl.com/fs2.
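To see where an existing workload sits relative to the 1 MB guideline mentioned above, a query along the following lines could be used. This is only a sketch: the table variable @Documents, its Content column, and the sample rows are hypothetical stand-ins for whatever table actually holds your LOB data.

```sql
-- Hypothetical stand-in for a real table with a VARBINARY(MAX) column
DECLARE @Documents TABLE
(
    DocumentID INT NOT NULL PRIMARY KEY,
    Content    VARBINARY(MAX) NULL
);

INSERT INTO @Documents (DocumentID, Content)
VALUES (1, CAST(REPLICATE(CAST('x' AS VARCHAR(MAX)), 2000000) AS VARBINARY(MAX))), -- about 2 MB
       (2, CAST('a small document' AS VARBINARY(MAX)));

-- Count the objects on each side of the 1 MB (1,048,576 byte) threshold
SELECT CASE WHEN DATALENGTH(Content) >= 1048576
            THEN '1 MB or larger'
            ELSE 'Smaller than 1 MB'
       END                                  AS SizeBucket,
       COUNT(*)                             AS ObjectCount,
       SUM(DATALENGTH(Content)) / 1048576.0 AS TotalMB
FROM   @Documents
WHERE  Content IS NOT NULL
GROUP BY CASE WHEN DATALENGTH(Content) >= 1048576
              THEN '1 MB or larger'
              ELSE 'Smaller than 1 MB'
         END;
```

A distribution dominated by the larger bucket suggests the workload is of the kind the research found to favor file system storage; the threshold itself is a guideline rather than a hard rule.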

Enter FILESTREAM

By now it should be very clear that storing LOB data in the database offers a number of benefits, but introduces performance challenges. Conversely, storing the LOB data in the file system has overriding performance advantages, but fails to offer some of the basic data integrity, security, and manageability features that are required for business data, and which SQL Server provides.

Up to now, most people have adopted file system storage by necessity, and often struggled to overcome the associated shortcomings. This is exactly where the new FILESTREAM feature fits in. FILESTREAM storage, introduced in SQL Server 2008, is implemented as an extension to the VARBINARY(MAX) data type. In other words, SQL Server implements FILESTREAM storage through VARBINARY(MAX) columns with an additional flag indicating that the BLOB data should be stored in the FILESTREAM data container. It allows BLOBs to be stored in a special folder on the NTFS file system, while bringing that data under the transactional control of SQL Server. When a VARBINARY(MAX) column is marked as a FILESTREAM column, the 2 GB storage limit SQL Server puts on large-value types is no longer applicable.

This new FILESTREAM attribute is only available on the VARBINARY(MAX) data type. It cannot be applied to the VARCHAR(MAX) data type. If you have large plain text values to store, you can opt to store them in a VARBINARY(MAX) FILESTREAM column and use the CAST( AS VARCHAR(MAX)) method, if you need access to the plain text data in a query. When accessing a CLOB stored in a FILESTREAM column using the FILESTREAM APIs (SqlFileStream managed class or the Win32 function, OpenSqlFilestream), you can use functions available in your programming language of choice to convert the byte stream to plain text.
For this reason, from this point on, we'll use the generic term "BLOB" to refer to all forms of LOB data.

Note that, to access FILESTREAM data through the streaming API, you must connect to SQL Server using Integrated Security (Windows authentication). If you connect to SQL Server using SQL Server authentication, you will get an "Access Denied" error when you try to open the FILESTREAM data file.

By applying the FILESTREAM attribute to a VARBINARY(MAX) column, all data in the resulting FILESTREAM column is actually stored in a file system folder, known as the FILESTREAM data container, which SQL Server specially configures for FILESTREAM data storage.

The intent is to provide a "best of both worlds" solution, offering the advantages below.
• Excellent NTFS data streaming performance – Makes reading and writing the BLOB data much faster.
• Faster client access methods – Client applications can request access to the FILESTREAM data files directly using the Win32 API or the .NET Objects exposed by SQL Server. FILESTREAM data can still be modified using T-SQL methods (which is not a recommended method), but the file system API methods allow much faster read/write access to the FILESTREAM data.
• Reduced overhead on SQL Server – When accessing the FILESTREAM data through the Win32 API or .NET Classes, FILESTREAM operations no longer overload SQL Server's internal memory.
• Transactional consistency – BLOB data operations happen within the context of a transaction so, if the transaction rolls back, the relational data modifications and BLOB changes will be rolled back.
• Significantly easier backup management – Though the BLOB data is still stored in the file system, a database backup includes the FILESTREAM data as well as all the relational data. When the backup is restored, the file system data is also restored to the new location. This topic, along with a few caveats, will be covered in detail in Chapter 8, Backup and Restore for FILESTREAM Databases.

• Point-in-time restores – With FILESTREAM enabled, changes to the BLOB data in the file system are captured in the transaction log, and the log can be backed up, enabling DBAs to restore all the data to a state as of a specific date and time.
• Improved security management – Permissions can be granted or denied on a FILESTREAM column just as for any other SQL Server columns.

There are also a few limitations arising from the use of FILESTREAM. For example, database mirroring does not support FILESTREAM data; Transparent Data Encryption (TDE) does not work with FILESTREAM data; and database snapshots cannot be created for FILESTREAM filegroups. These limitations are discussed further in Chapter 10, and on Books Online (http://brurl.com/fs3).

A first look at FILESTREAM

Chapter 2 goes into much greater depth on how FILESTREAM works, how to create new databases and tables that can store FILESTREAM data, how to insert, update, and delete FILESTREAM data, and so on.

However, to whet your appetite, we'll walk through a very simple example here of how to start using FILESTREAM on an existing SQL Server instance, covering:
• enabling the FILESTREAM feature for a given SQL Server instance
• setting the access level for the FILESTREAM data
• adding FILESTREAM storage to an existing database
• adding a FILESTREAM column to an existing table
• inserting BLOB data into the FILESTREAM column.
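Before working through these steps on an existing instance, it may be worth checking which databases already use the features named in the limitations above. The following is a minimal sketch, not a definitive pre-flight check; it relies only on the sys.databases and sys.database_mirroring catalog views, and the column aliases are names chosen for this example.

```sql
-- Flags databases that use TDE, mirroring, or database snapshots,
-- so you know where the FILESTREAM limitations would come into play
SELECT d.name                                  AS DatabaseName,
       d.is_encrypted                          AS UsesTDE,
       CASE WHEN m.mirroring_guid IS NOT NULL
            THEN 1 ELSE 0
       END                                     AS IsMirrored,
       (SELECT COUNT(*)
        FROM   sys.databases AS s
        WHERE  s.source_database_id = d.database_id) AS SnapshotCount
FROM   sys.databases AS d
       LEFT JOIN sys.database_mirroring AS m
              ON m.database_id = d.database_id
WHERE  d.source_database_id IS NULL;  -- exclude the snapshots themselves
```

A non-zero value in any of these columns does not by itself prevent you from following the walk-through; it simply indicates that the relevant limitation deserves a closer look for that database.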

FILESTREAM is a hybrid feature that requires both the Windows Administrator and the SQL Server Administrator to perform actions before the feature is enabled. Because of this, the configuration is often a two-step process.

The Windows Administrator manages the Windows-related configuration changes needed for the FILESTREAM feature. For example, Windows Administrator privileges are required to create a Windows file share for the FILESTREAM feature. The Windows Administrator uses the SQL Server Configuration Manager to enable, disable, and configure FILESTREAM.

The SQL Server Administrator manages the configuration changes within the boundaries of the given SQL Server instance. The SQL Administrator enables, disables, and configures the FILESTREAM Access Level, either by using SQL Server Management Studio or by running sp_configure.

Appendix A describes in great detail how to enable and configure the FILESTREAM feature, both during a SQL Server installation and for existing instances and databases. If you encounter any difficulties, or need further information, please refer there.

Enabling FILESTREAM

Our first step is to enable the FILESTREAM feature at the instance level. Briefly, on the machine on which the SQL Server 2008 instance is installed, navigate as follows (this path may change in future versions): Start | All Programs | Microsoft SQL Server 2008 | Configuration Tools | SQL Server Configuration Manager, then select SQL Server Services. Right-click on the appropriate instance, select Properties then move to the FILESTREAM tab. In here, you can enable the instance for T-SQL and file I/O streaming access.
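It can be useful to see what the instance currently reports, both before and after making these changes. The following is a minimal sketch rather than a required step; the column aliases are our own, and it assumes you have permission to read server-level properties.

```sql
-- What the instance reports for FILESTREAM: the level that has been configured,
-- the level actually in effect, and the Windows share name (if any).
SELECT  SERVERPROPERTY('FilestreamConfiguredLevel') AS ConfiguredLevel ,
        SERVERPROPERTY('FilestreamEffectiveLevel') AS EffectiveLevel ,
        SERVERPROPERTY('FilestreamShareName') AS ShareName;

-- The same setting as seen through the server configuration options.
SELECT  name ,
        value ,
        value_in_use
FROM    sys.configurations
WHERE   name = 'filestream access level';
```

If the configured and effective levels differ, the Windows-side and SQL Server-side settings are probably out of step, which is a reasonable first thing to check when FILESTREAM does not behave as expected.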

Setting the access level

All the rest of the work can be done from within SQL Server Management Studio (SSMS), so open it up and connect to the instance on which you just enabled FILESTREAM. Since we are enabling FILESTREAM access on an instance that did not have FILESTREAM enabled as part of the SQL Server installation process, we will also need to set the FILESTREAM Access Level, as demonstrated in Listing 1-6.

    EXEC sp_configure filestream_access_level, 2;
    GO
    RECONFIGURE
    GO

Listing 1-6: Setting the FILESTREAM access level for the instance.

Here, we've enabled both T-SQL and file I/O streaming access for the instance. Alternatively, by setting the filestream_access_level to 1, we could have enabled T-SQL access only (a value of 0 means disabled).

This same process can be performed through the SSMS GUI (see Appendix A).

Adding FILESTREAM storage

Next, we need to add a FILESTREAM filegroup and FILESTREAM file (data container) to the NorthPole database, as shown in Listing 1-7.

    ALTER DATABASE NorthPole
    ADD FILEGROUP NorthPole_fs
    CONTAINS FILESTREAM;
    GO
    ALTER DATABASE NorthPole
    ADD FILE (NAME = NorthPole_fs,
    FILENAME = 'C:\ArtOfFS\Demos\Chapter1\NorthPole_fs')
    TO FILEGROUP NorthPole_fs;
    GO

Listing 1-7: Adding a FILESTREAM filegroup and data container.

Note that the C:\ArtOfFS\Demos\Chapter1 directory must exist on the appropriate SQL Server instance machine (see Appendix A for more discussion of NTFS file system configuration).

Adding a FILESTREAM column

Having done this, we can now create a table in this database that has a FILESTREAM column, as shown in Listing 1-8.

    CREATE TABLE [dbo].[Authors]
        (
          [AuthorID] UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE ,
          [AuthorName] VARCHAR(50) ,
          [AuthorImage] VARBINARY(MAX) FILESTREAM NULL
        )

Listing 1-8: Creating an Authors table with a FILESTREAM column.

For reasons that will be discussed in much more detail in Chapter 3, a FILESTREAM column can be created only on a table that has a UNIQUEIDENTIFIER column with the ROWGUIDCOL attribute. Notice, also, that the table contains an AuthorImage column of type VARBINARY(MAX) to which we've applied the FILESTREAM attribute.
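To confirm that the FILESTREAM attribute really was applied, we can look at the column metadata. This is only a sketch, assuming the Authors table was created in the NorthPole database as described; the aliases are our own.

```sql
USE NorthPole;
GO
-- is_filestream = 1 marks a VARBINARY(MAX) column stored as FILESTREAM;
-- is_rowguidcol = 1 marks the UNIQUEIDENTIFIER column that FILESTREAM requires.
SELECT  c.name AS ColumnName ,
        TYPE_NAME(c.user_type_id) AS DataType ,
        c.is_rowguidcol ,
        c.is_nullable ,
        c.is_filestream
FROM    sys.columns AS c
WHERE   c.object_id = OBJECT_ID('dbo.Authors')
ORDER BY c.column_id;
```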

Inserting BLOB data

In Listing 1-9, we insert an image of the famous Simple-Talk author, Phil Factor, into this column, and the image will be automatically stored in the NorthPole_fs data container, on the file system, specified in Listing 1-7.

    -- Insert the data to the table
    INSERT INTO [dbo].[Authors]
            ( AuthorID ,
              AuthorName ,
              AuthorImage
            )
            SELECT  NEWID() ,
                    'PhilFactor' ,
                    bulkcolumn
            FROM    OPENROWSET(BULK 'C:\temp\philFactor.gif', SINGLE_BLOB) AS x

    SELECT  *
    FROM    dbo.authors

    AuthorID                             AuthorName           AuthorImage
    ------------------------------------ -------------------- -------------
    C0B1E727-40B3-48FE-B83A-2FD6C1261695 PhilFactor           0x474946383761C8001B01F7000004...
    (1 row(s) affected)

Listing 1-9: Inserting an image into the Authors table.

Notice that we use the NEWID() function to assign a new unique value to our AuthorID column. We use the OPENROWSET function, with the BULK rowset provider, to read in our image file. The SINGLE_BLOB argument specifies that the file is returned as a single-row, single-column rowset, called bulkcolumn, of type VARBINARY(MAX). We then simply insert this image into the AuthorImage column.
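Selecting the raw bytes is rarely what we want to look at. As a small sketch, the query below reports the size of each stored image and the logical path that SQL Server exposes for it; it assumes file I/O streaming access was enabled as in Listing 1-6, and the aliases are our own.

```sql
-- DATALENGTH returns the number of bytes in the BLOB.
-- PathName() returns the logical path of the FILESTREAM value, which client
-- applications use with the streaming API; it is not a physical file
-- location to be opened directly.
SELECT  AuthorName ,
        DATALENGTH(AuthorImage) AS ImageSizeInBytes ,
        AuthorImage.PathName() AS LogicalPath
FROM    dbo.Authors;
```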

We now have an image, stored in the file system, but under the transactional control of SQL Server, and which would be included in database backups, and so on. We are also now ready to take advantage of the performance benefits of streaming and fast file system API access methods when reading and modifying this data, as we will see in Chapter 2.

When to use FILESTREAM

Despite the obvious advantages of FILESTREAM to SQL Server applications that use BLOB data, the decision of where and how to use it should be taken only after carefully analyzing the storage requirements, data usage patterns, possible administrative overheads, and so on.

Size of the files

The previously-referenced Microsoft Research paper, To BLOB or Not To BLOB: Large Object Storage in a Database or a File System, suggests that, for BLOBs that are smaller than 256 KB, the best performance will probably be obtained by simply storing the data directly in the relational tables. Files larger than 1 MB are better stored in the file system, and managed through SQL Server using FILESTREAM. In that in-between region of 256 KB to 1 MB, the decision of which way to go can only be taken after analyzing the specific data usage scenarios.

Generally speaking, though, if the average size of the BLOB values that you expect your application to process is below 1 MB, then direct storage in the database, using a normal VARBINARY(MAX) column, is likely to perform better in most cases. However, this may vary based on how write-intensive the workload is. Even 256 KB files that are read or updated many times per second in a VARBINARY(MAX) column can cause memory exhaustion. The best way to take this decision is by running some tests that involve real workload and usage patterns, and seeing what fits best in your specific environment.
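If the BLOB data already lives in a VARBINARY(MAX) column, a quick size profile is a reasonable starting point for this analysis. The sketch below buckets the values by the 256 KB and 1 MB thresholds mentioned above. The table dbo.Documents and its DocumentBody column are hypothetical names chosen for illustration; substitute your own.

```sql
-- Hypothetical table and column names: dbo.Documents, DocumentBody.
-- 262144 bytes = 256 KB; 1048576 bytes = 1 MB.
SELECT  s.SizeBucket ,
        COUNT(*) AS BlobCount ,
        SUM(s.SizeInBytes) / 1048576.0 AS TotalMB
FROM    ( SELECT    DATALENGTH(d.DocumentBody) AS SizeInBytes ,
                    CASE WHEN DATALENGTH(d.DocumentBody) < 262144
                         THEN '1: under 256 KB'
                         WHEN DATALENGTH(d.DocumentBody) <= 1048576
                         THEN '2: 256 KB to 1 MB'
                         ELSE '3: over 1 MB'
                    END AS SizeBucket
          FROM      dbo.Documents AS d
          WHERE     d.DocumentBody IS NOT NULL
        ) AS s
GROUP BY s.SizeBucket
ORDER BY s.SizeBucket;
```

A profile like this only describes sizes, not access patterns, so it complements rather than replaces testing with a realistic workload.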

Data usage patterns

FILESTREAM does not support in-place (also referred to as "partial") updates of BLOB data files. Instead, every time a BLOB value is modified, SQL Server creates an entirely new copy of the file with the modified content, and discards the old file. The old file will be deleted by an asynchronous garbage collector process.

If your business requirement includes making frequent changes to the BLOB data values, then FILESTREAM may not be the right option. However, this decision should be taken only after running some tests with the expected usage pattern and deciding whether to go with FILESTREAM or VARBINARY(MAX) storage.

Administrative overhead

Though FILESTREAM offers a number of benefits to the database administrators, there may be cases when it adds unacceptable administrative and storage overhead.

As discussed earlier, when you take a full database backup of the FILESTREAM-enabled database, all the FILESTREAM data will, by default, come along with your backup. When you restore the backup to a different location, the FILESTREAM data will be restored as well.

This is great from a management perspective but, if you have terabytes of FILESTREAM data, this will add a huge storage overhead to your backup files, and greatly increase the time and server resources required to perform the backups. For example, let's say you have a 10 GB database and 20 Terabytes of FILESTREAM data. Your SLA requires a daily full backup of the database to be taken, but the FILESTREAM data is seldom modified. You will be adding massively to the size and overhead of your daily full backups, with most of the space taken up by FILESTREAM data that hasn't changed since last time.

In such cases, it probably makes sense to exploit the fact that FILESTREAM data is stored in a separate, dedicated filegroup, and that SQL Server supports partial backups, whereby only selected filegroups are backed up. We could take backups excluding the FILESTREAM filegroups, and then less-frequent backups that include only the FILESTREAM data.

In short, it is recommended that you analyze the workload and the type of data to be processed, and undertake some performance tests before deciding whether or not to use FILESTREAM.

Summary

As business applications become ever more versatile, the smooth handling of unstructured, LOB data, alongside the normal (and normalized) relational data is becoming an increasingly common requirement.

Storing such data in the database (in a VARBINARY(MAX) column) offers a number of advantages, including transactional consistency, point-in-time recovery, manageability and better security administration. However, these benefits come with a large memory overhead and, often, a severe performance penalty. Microsoft Research shows that the file system is the preferred storage medium for files over 1 MB in size.

Storing BLOB data in the file system offers performance benefits, taking advantage of the streaming capabilities of NTFS, but it does not provide transactional consistency, point-in-time recovery, security, and so on, and this is a big drawback for many business applications.

FILESTREAM is a new storage feature introduced in SQL Server 2008. It stores the BLOB data in the file system and allows you to take advantage of the NTFS stream capabilities. At the same time, FILESTREAM ensures transactional consistency between the relational data and BLOB data stored in the file system.

Furthermore, administering and safeguarding the data is much simpler, since the BLOB data is included in normal database backups, and any modifications to that data are included in the transaction log file, so point-in-time recovery is supported.

Any BLOB data that is under 1 MB is probably still best stored in a VARBINARY(MAX) column, directly in the database. However, for all other BLOB data, FILESTREAM storage is the way to go.

Chapter 2: Getting Started with FILESTREAM

As we discussed in the previous chapter, FILESTREAM enables you to store BLOB data in the file system so that you can take advantage of the streaming and caching capabilities of NTFS. By accessing the BLOB data using the Win32 streaming APIs exposed by SQL Server 2008, you can offload the processing overhead from SQL Server.

In this chapter, we look in detail at how FILESTREAM data is stored, and how you set up a FILESTREAM database, table, and column. We then provide an example of inserting data into the column.

The FILESTREAM Data Container

SQL Server stores the FILESTREAM data in a file system folder which is known as the FILESTREAM data container. Each FILESTREAM-enabled database must have its own unique data container. The location of the FILESTREAM data container is specified when you create a FILESTREAM-enabled database or when you enable FILESTREAM on an existing database.

SQL Server creates a number of subfolders in the data container to manage the FILESTREAM data; a folder is created for each table that has one or more FILESTREAM columns. If a table is partitioned, a folder is created for each partition. Within the folder for each table or partition, a subfolder is created for each of the FILESTREAM-enabled columns, to store the BLOB data.

When SQL Server configures a file system folder as the FILESTREAM data container, it creates a file named filestream.hdr, and a folder named $FSLOG. Both of these file system objects are used by SQL Server internally and it is strongly recommended that you do not access or modify them outside SQL Server.

The filestream.hdr file stores the internal information that SQL Server uses to manage the BLOB data stored in the FILESTREAM data container, and to associate it with the relational data stored in the database tables. The $FSLOG folder is used to keep track of the changes applied on the BLOB data stored in the FILESTREAM data container, similar to a transaction log in a SQL Server database.

It is important to remember that two databases cannot share the same file system location to store their FILESTREAM data. It is not permitted even to use a subfolder within the FILESTREAM data container of one database to store the FILESTREAM data container of another database.

For example, assume that you have a FILESTREAM-enabled database named NorthPole with a FILESTREAM data container located in C:\FS\NorthPole. Table 2-1 shows some examples of valid and invalid FILESTREAM data container locations for another FILESTREAM-enabled database called SouthPole, assuming the NorthPole database already exists.

    FILESTREAM Data Container    Comments
    -------------------------    --------
    C:\FS\SouthPole              Valid
    C:\FS\NorthPole              Invalid: C:\FS\NorthPole is already used by the
                                 NorthPole database.
    C:\FS\NorthPole\SouthPole    Invalid: C:\FS\NorthPole is already used by the
                                 NorthPole database; therefore the data container
                                 cannot be created as a subfolder.

Table 2-1: Valid and invalid FILESTREAM data container locations.
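Before choosing a location for a new data container, it may help to list the containers that the instance already uses, so that the new path neither matches nor sits inside one of them. A minimal sketch, assuming you have permission to query sys.master_files (the aliases are our own):

```sql
-- One row per FILESTREAM data container on the instance.
-- For FILESTREAM files, physical_name is the container folder, not a file.
SELECT  DB_NAME(mf.database_id) AS DatabaseName ,
        mf.name AS LogicalFileName ,
        mf.physical_name AS DataContainerPath
FROM    sys.master_files AS mf
WHERE   mf.type_desc = 'FILESTREAM'
ORDER BY mf.physical_name;
```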

Note that SQL Server does not prevent you from storing your own data in the FILESTREAM data container; for example, you can create additional files and folders in a FILESTREAM data container. However, you should not touch the folders and files created by SQL Server directly; only access the FILESTREAM data container using the APIs provided by SQL Server.

Understanding the FILESTREAM Filegroup

A SQL Server database requires a minimum of two filegroups to store its data: the data filegroup and the log filegroup. Most real-world databases have more than one data filegroup, usually placed on different disk drives to improve the I/O performance.

To store FILESTREAM data, a database must have a new type of filegroup known as a FILESTREAM filegroup. A separate filegroup is needed because the FILESTREAM BLOB data is not stored within the database; it is stored in the FILESTREAM data container. So a FILESTREAM-enabled database needs to have at least three filegroups:

• a data filegroup to store relational data
• a log filegroup to store transactional logs
• a FILESTREAM filegroup to store FILESTREAM data.

As with the data filegroups, you might decide to have more than one FILESTREAM filegroup in a database, to distribute the FILESTREAM I/O load into multiple disk volumes. In such cases, one of those filegroups must be marked as the default filegroup. When you create a table, you can choose the filegroup in which you want SQL Server to store the FILESTREAM data (described later in the section Creating a database with multiple FILESTREAM filegroups).
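As a brief preview of choosing a filegroup at table creation time, the sketch below uses the FILESTREAM_ON clause of CREATE TABLE. It assumes a database that already has a FILESTREAM filegroup named NorthPole_fs, like the one added in Chapter 1; the ItemManuals table and its columns are hypothetical names used only for illustration.

```sql
-- Hypothetical table. FILESTREAM_ON names the FILESTREAM filegroup that
-- will hold the BLOB data for this table's FILESTREAM columns.
CREATE TABLE [dbo].[ItemManuals]
    (
      [ManualID] UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE ,
      [ManualTitle] VARCHAR(50) ,
      [ManualFile] VARBINARY(MAX) FILESTREAM NULL
    )
FILESTREAM_ON NorthPole_fs;
```

If the FILESTREAM_ON clause is omitted, SQL Server places the data in the default FILESTREAM filegroup, which is why one filegroup must carry that designation.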

To further understand the FILESTREAM filegroup, let's take a look at the minimal T-SQL script that creates a FILESTREAM-enabled database (Listing 2-1).

    /*01*/ CREATE DATABASE NorthPole ON
    /*02*/ PRIMARY (
    /*03*/     NAME = NorthPole,
    /*04*/     FILENAME = 'C:\ArtOfFS\Demos\Chapter2\NorthPole.mdf'
    /*05*/ ), FILEGROUP NorthPole_fs CONTAINS FILESTREAM(
    /*06*/     NAME = NorthPole_fs,
    /*07*/     FILENAME = 'C:\ArtOfFS\Demos\Chapter2\NorthPole_fs')
    /*08*/ LOG ON (
    /*09*/     NAME = NorthPole_log,
    /*10*/     FILENAME = 'C:\ArtOfFS\Demos\Chapter2\NorthPole_log.ldf')

Listing 2-1: Sample T-SQL script to create a FILESTREAM-enabled database.

Lines 05, 06, and 07 specify the configuration parameters for the FILESTREAM filegroup. A closer look will highlight a few differences between a FILESTREAM filegroup definition and the definition of a data or log filegroup.

First of all, the FILESTREAM filegroup uses the additional clause CONTAINS FILESTREAM which tells SQL Server to treat the filegroup as a FILESTREAM filegroup. This is important, because SQL Server needs to do some additional work to configure the FILESTREAM filegroup.

Next, notice the value of the FILENAME parameter in line 07; while the FILENAME parameters of data files and log files point to a disk file, for a FILESTREAM filegroup it points to a folder. The FILESTREAM data will be stored in multiple files, with one file for each data value. Therefore, it makes sense that the FILESTREAM filegroup points to a folder, rather than a single file.

When you create a FILESTREAM-enabled database, SQL Server will create and configure the folder specified as the FILESTREAM data container. It is very important to ensure that the leaf-level folder within the path that you provide does not already exist, so that SQL Server can go ahead and create it. Please also note that SQL Server will create only the leaf-level folder and assume that the rest of the folder structure already exists. The operation will fail if the leaf-level folder exists, or if top level folders do not exist, or if the folder structure specified is within the FILESTREAM data container of another database.

In the above example, C:\ArtOfFS\Demos\Chapter2\NorthPole_fs is specified as the FILESTREAM data container. Therefore, the folder structure C:\ArtOfFS\Demos\Chapter2 should already exist at the time of running the above script. If this folder structure does not exist, the operation will fail with the error:

    The path specified by 'C:\ArtOfFS\Demos\Chapter2\NorthPole_fs' is not in a valid directory.

It is not possible to create a FILESTREAM data container as a child folder of another FILESTREAM data container. If you attempt, for example, to create a FILESTREAM data container called FS2 within an existing FILESTREAM data container, SQL Server will fail the operation with the error:

    The path specified by 'C:\ArtOfFS\Demos\Chapter2\Northpole_fs\FS2' cannot be used for a FILESTREAM container since it is contained in another FILESTREAM container.

The folder NorthPole_fs should not exist within C:\ArtOfFS\Demos\Chapter2 because SQL Server needs to create it. If the folder already exists, the operation will fail with the error:

    Cannot create file 'C:\ArtOfFS\Demos\Chapter2\NorthPole_fs' because it already exists. Change the file path or the file name, and retry the operation.

It may be interesting to examine how SQL Server identifies whether or not the FILESTREAM data container specified is within the FILESTREAM container of another database. It appears that SQL Server looks for the existence of the FILESTREAM header file in any of the upper level folders. The name of the FILESTREAM header file is filestream.hdr. SQL Server creates this file within the FILESTREAM data container of every FILESTREAM-enabled database when the database is created, and keeps an exclusive lock on it as long as the database is running.

What is very interesting to know is that SQL Server only checks for the filename and not the data or a signature inside the file. So, if you create a file named filestream.hdr and place it in one of the folders, SQL Server thinks that it is a FILESTREAM data container and will not allow you to configure any of its subfolders as a FILESTREAM data container. Filegroups are discussed in more detail later in this chapter, in the Tables and FILESTREAM filegroups section.

Creating FILESTREAM-enabled Databases

A FILESTREAM-enabled database is one that has one or more FILESTREAM filegroups. Such a database can be used to create relational tables that contain FILESTREAM columns. BLOB data that you insert into a FILESTREAM column is stored in the FILESTREAM data container within the specified FILESTREAM filegroup.

You can enable FILESTREAM when you create a database, or you can enable it on an existing database. However, before you do either, you must ensure that the FILESTREAM feature is enabled and correctly configured at the SQL Server instance level. Detailed information about how to do this is provided in Appendix A: Configuring FILESTREAM on a SQL Server Instance.

Note that the configuration demonstrated in this chapter, when creating a new FILESTREAM-enabled database, will get you started, but is rather basic and probably not suitable for a production deployment. In Chapter 7, we discuss how to adapt filegroup architectures for very large FILESTREAM databases. In this chapter, we also don't consider any aspects of the configuration of the underlying file system; see Chapter 12 for a discussion of this topic.

Creating a new FILESTREAM-enabled database

Once you have ensured that the SQL Server instance is FILESTREAM enabled, you can create a new FILESTREAM-enabled database. It is possible to do this using either SSMS or T-SQL.

Using SSMS to create a new FILESTREAM-enabled database

When creating a FILESTREAM-enabled database using SSMS, you create the database in the usual way, but then there are two additional steps: creating a FILESTREAM filegroup for the new database, and adding a FILESTREAM data file that points to the FILESTREAM filegroup.

You create the filegroup using the Filegroups page of the New Database dialog box. You will see a Filestream section in the lower half of the page, where you can create one or more FILESTREAM filegroups for the new database being created, as shown in Figure 2-1.

Figure 2-1: Adding FILESTREAM filegroups to a database.

Next, you add a new file to the database to point to the FILESTREAM filegroup. You do this on the General tab. The File Type of this file should be set to the Filestream Data file type and should point to a valid FILESTREAM filegroup, as shown in Figure 2-2.

Note that under SQL Server 2008 and 2008 R2, only one FILESTREAM data file can be added to a FILESTREAM filegroup. This limitation has been removed in SQL Server 2012 which allows creating more than one FILESTREAM data file per FILESTREAM filegroup.

Figure 2-2: Adding a Filestream Data-type file to a database.

When you select Filestream Data, the Filegroup column is automatically filled with the name of the default FILESTREAM filegroup specified in the Filegroups page. If your database has more than one FILESTREAM filegroup and you want to change the filegroup associated with a newly-added FILESTREAM data file, you can do it by selecting a different filegroup from the drop-down control. However, this can be done only at the time of adding a new FILESTREAM data file into the database. Once the changes are saved, it cannot be modified further.

There is something more to note here. When adding a FILESTREAM data file to the database, the Path column should point to the folder in which the FILESTREAM data container for the database will be created, as shown in Figure 2-3. Note that you specify the parent folder of the data container in Path. SSMS creates a subfolder in the parent folder for the FILESTREAM data container, with the same name as the FILESTREAM logical name of the data file.

Figure 2-3: Specifying the FILESTREAM data file path.

In this example, a new folder named NorthPole_fs will be created in C:\ArtOfFS\Demos\Chapter2. SQL Server will configure C:\ArtOfFS\Demos\Chapter2\NorthPole_fs as the FILESTREAM data container (see The FILESTREAM Data Container, earlier in this chapter).

When using SSMS to create a FILESTREAM-enabled database, the name of the FILESTREAM data container folder will always be the logical name of the FILESTREAM data file. When creating a FILESTREAM-enabled database using T-SQL, you can explicitly specify a folder name which can be different from the logical file name.

Using T-SQL to create a FILESTREAM-enabled database

In the previous section we saw how to create a FILESTREAM-enabled database using SSMS. We will now do the same using T-SQL, so let's look in more detail at the minimum script for creating a FILESTREAM-enabled database.

    -- If the NorthPole database exists, drop it
    IF DB_ID('NorthPole') IS NOT NULL BEGIN
        DROP DATABASE NorthPole;
    END

    -- Create NorthPole database
    CREATE DATABASE NorthPole ON
    PRIMARY (
        NAME = NorthPole,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter2\NorthPole.mdf'
    ), FILEGROUP NorthPole_fs CONTAINS FILESTREAM(
        NAME = NorthPole_fs,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter2\NorthPole_fs' )
    LOG ON (
        NAME = NorthPole_log,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter2\NorthPole_log.ldf')

Listing 2-2: T-SQL code to create a FILESTREAM-enabled database.

The script is the same one we would use to create a regular database, except for the following part that creates an additional filegroup to store the FILESTREAM data.

    FILEGROUP NorthPole_fs CONTAINS FILESTREAM(
        NAME = NorthPole_fs,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter2\NorthPole_fs' )

Listing 2-3: Code snippet to show the definition of a FILESTREAM filegroup.

The first line in the above listing creates a FILEGROUP named NorthPole_fs. The syntax is the same as creating or adding a regular data or log filegroup to a database. The clause CONTAINS FILESTREAM is new in SQL Server 2008; it specifies that the filegroup is to be added as a FILESTREAM filegroup. The next line, NAME = NorthPole_fs, specifies the logical name of the FILESTREAM file within that filegroup (here, it happens to match the filegroup name).

The FILENAME parameter in the third line specifies the file system location to be configured as the FILESTREAM data container. When creating a new FILESTREAM-enabled database using T-SQL, you must provide the complete path to the FILESTREAM data container (C:\ArtOfFS\Demos\Chapter2\NorthPole_fs in the above example). The parent folder (C:\ArtOfFS\Demos\Chapter2) must exist at the time of executing the T-SQL script; however, the last subfolder in the path (NorthPole_fs) must not exist – if the folder already exists, the T-SQL script will fail, as described previously. Remember, you must not tamper with the FILESTREAM data container folder. Any change you make directly to this folder could corrupt your database.

Note also the following, with regard to a FILESTREAM filegroup:

• it cannot be named PRIMARY
• SIZE, MAXSIZE, and FILEGROWTH attributes are inapplicable
• it cannot point to a UNC path; it must refer to a path on the local server, a storage area network or an iSCSI target
• it can be placed on a compressed disk volume.
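Once the database exists, the catalog views show how its filegroups and files fit together. The following is a sketch, assuming the NorthPole database was created as in Listing 2-2; the aliases are our own.

```sql
USE NorthPole;
GO
-- One row per file that belongs to a filegroup. For the FILESTREAM
-- filegroup, physical_name is the data container folder.
SELECT  fg.name AS FilegroupName ,
        fg.type_desc AS FilegroupType ,
        fg.is_default ,
        df.name AS LogicalFileName ,
        df.physical_name
FROM    sys.filegroups AS fg
        INNER JOIN sys.database_files AS df
            ON df.data_space_id = fg.data_space_id
ORDER BY fg.data_space_id;
```

The transaction log file does not appear in this result, because sys.database_files reports a data_space_id of 0 for log files.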

Enabling an existing database for FILESTREAM

We have seen how to add FILESTREAM storage capabilities when creating a new database. Next, we will examine the process of adding FILESTREAM capabilities to an existing database.

When the FILESTREAM feature is enabled at the SQL Server instance level, the next step is to check whether the database already has a FILESTREAM filegroup, by running the query in Listing 2-4.

    IF EXISTS ( SELECT  *
                FROM    sys.filegroups
                WHERE   type = 'FD' )
        BEGIN
            PRINT 'FILESTREAM Filegroup Exists!'
        END
    ELSE
        BEGIN
            PRINT 'FILESTREAM Filegroup does not exist!'
        END

Listing 2-4: Checking for a default FILESTREAM filegroup on the database.

You can then use T-SQL in Listing 2-5 to add a new FILESTREAM filegroup to the database.

    ALTER DATABASE NorthPole
    ADD FILEGROUP NorthPole_fs
    CONTAINS FILESTREAM ;
    GO

Listing 2-5: Adding a new FILESTREAM filegroup to a database.

Once the FILESTREAM filegroup has been added to the database, you can add a FILESTREAM file (which points to the FILESTREAM data container) using the T-SQL command in Listing 2-6.

    ALTER DATABASE NorthPole
    ADD FILE (NAME = NorthPole_fs,
    FILENAME = 'C:\ArtOfFS\Demos\Chapter2\NorthPole_fs')
    TO FILEGROUP NorthPole_fs ;
    GO

Listing 2-6: Adding a FILESTREAM data container to a database.

The database is now FILESTREAM enabled. We can go ahead and start creating tables that contain FILESTREAM data.

Creating a Table with FILESTREAM Columns

Once we have a FILESTREAM-enabled database, we are ready to go ahead and create tables containing FILESTREAM columns. A FILESTREAM column is a VARBINARY(MAX) column that has the FILESTREAM attribute enabled.

Using SSMS table designer to create a table with FILESTREAM columns

You may be surprised if you try to create a table with FILESTREAM columns using the SSMS table designer. There is no way to do this in SQL Server 2008 or SQL Server 2008 R2, and the feature is still missing in the latest version at the time of writing (SQL Server 2012). You can create a table with a VARBINARY(MAX) column, but there is no way to set the FILESTREAM flag on the column.

Therefore, the recommended method for creating tables with FILESTREAM columns is to write the CREATE TABLE statements using T-SQL (as described in the following section). Alternatively, you can do this in a two-step process: create all columns except the FILESTREAM columns using the table designer, and then use T-SQL ALTER TABLE statements to create the FILESTREAM column (see Using T-SQL to add FILESTREAM columns to an existing table, later in this chapter).

Using T-SQL to create a table with FILESTREAM columns

The script in Listing 2-7 creates a table with a FILESTREAM column. The FILESTREAM column is created by adding the FILESTREAM attribute to the VARBINARY(MAX) column called ItemImage. Only VARBINARY(MAX) columns can be used for FILESTREAM storage.

    CREATE TABLE [dbo].[Items]
        (
          [ItemID] UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE ,
          [ItemNumber] VARCHAR(20) ,
          [ItemDescription] VARCHAR(50) ,
          [ItemImage] VARBINARY(MAX) FILESTREAM NULL
        )

Listing 2-7: Creating a table with a FILESTREAM column.

Notice that FILESTREAM is not a data type; it is an attribute that you can set on a VARBINARY(MAX) column. Setting this attribute adds FILESTREAM storage capabilities to the column.

The ItemID column in the above listing is very significant. A FILESTREAM column can be created only on a table that has a UNIQUEIDENTIFIER column marked as ROWGUIDCOL. The requirements for the ROWGUIDCOL column are as follows:

• cannot allow NULL
• should have either the UNIQUE attribute or be the entire primary key
• needs to be UNIQUEIDENTIFIER data type
• cannot be modified or dropped as long as FILESTREAM columns exist in the table.

SQL Server will create a subfolder in the FILESTREAM data container for each table that contains a FILESTREAM column. Within each table subfolder, another subfolder will be created for each FILESTREAM-enabled column in the table. If the table has multiple partitions, a folder will be created for each partition.

Listing 2-8 shows the structure of the FILESTREAM data container after creating a database and table using the example scripts in this chapter (the folder names have been shortened to simplify the listing). The folder names for the table and the FILESTREAM column are always GUID values, so these folder names will be different on your computer if you have followed the example.

    C:\
    |-ArtOfFS\
      |-Demos\
        |-Chapter2\
          |-NorthPole_fs\            -- FILESTREAM Data Container
            |-$FSLOG\                -- Log Folder
            |-filestream.hdr         -- Metadata File
            |-6488cba4…6eaad3\       -- Root folder for "Items" table
              |-7761f220…b204f1\     -- Folder for "ItemImage" column

Listing 2-8: Example FILESTREAM data container structure.

Note that the file locations in which SQL Server stores the BLOB data are irrelevant to us, because no direct access to these files is recommended. Any read/write operation to the BLOB data should be done either through T-SQL (in which you access the BLOB data as if it is a VARBINARY(MAX) column in the table) or through the Win32 API (in which you access the BLOB data using a logical identifier). This chapter provides examples using T-SQL, and the Win32 API is discussed further in Chapter 3: Accessing FILESTREAM Data from Client Applications.

Using T-SQL to add FILESTREAM columns to an existing table

Adding a FILESTREAM column to an existing table can be simple or complex, depending upon the current schema of the table. In the simplest case, the ALTER TABLE command could be as shown in Listing 2-9.

    ALTER TABLE [dbo].[Items]
    ADD ItemImage VARBINARY(MAX) FILESTREAM NULL
    GO

Listing 2-9: ALTER TABLE T-SQL script to add a FILESTREAM column to an existing table.

This script adds a new FILESTREAM column named ItemImage to the Items table. For the script to execute successfully, the table must already have a UNIQUEIDENTIFIER column that meets all requirements outlined above.

In most cases, you can look into the table schema to verify whether or not the table already has a UNIQUEIDENTIFIER column with the required characteristics. The easiest way may be to right-click on the table in SSMS and generate the CREATE script for the table. You can see the current attributes of all the columns in the script output.

There may be times when you are required to check the table schema programmatically. You can query the system catalog views to see if a table has all the required characteristics to have a FILESTREAM column. Querying is_rowguidcol in sys.columns will tell you whether the column is marked as ROWGUIDCOL. Similarly, the is_nullable column can be queried to see if the column allows NULL values. Additional information, such as whether the column has a unique constraint or single column primary key, can be retrieved from sys.key_constraints, sys.indexes and sys.index_columns. Listing 2-10 shows the T-SQL code to find out whether the Items table has a ROWGUIDCOL with the required attributes to have FILESTREAM columns.

    IF EXISTS ( SELECT  *
                FROM    sys.key_constraints sc
                        -- index_id is unique only within a table, so match the table as well
                        INNER JOIN sys.indexes si
                            ON sc.parent_object_id = si.object_id
                            AND sc.unique_index_id = si.index_id
                            AND si.object_id = OBJECT_ID('Items')
                        INNER JOIN ( SELECT * ,
                                            COUNT(*) OVER ( PARTITION BY index_id,
                                                            object_id ) AS ColCount
                                            /* number of columns in the index */
                                     FROM   sys.index_columns
                                   ) ic
                            ON ic.index_id = si.index_id
                            AND ic.object_id = si.object_id
                        INNER JOIN sys.columns c
                            ON c.column_id = ic.column_id
                            AND c.object_id = ic.object_id
                            AND c.is_rowguidcol = 1 -- ROWGUIDCOL
                            AND c.is_nullable = 0   -- NOT NULL
                WHERE   si.is_unique_constraint = 1 -- UNIQUE constraint
                        OR ( si.is_primary_key = 1
                             AND ColCount = 1
                           ) -- SINGLE COLUMN Primary Key
              )
        BEGIN
            PRINT 'Table can have FILESTREAM columns'
        END
    ELSE
        BEGIN
            PRINT 'Table is not ready to have FILESTREAM columns'
        END

Listing 2-10: Checking whether a table is FILESTREAM ready.
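When the check reports that the table is not ready, it helps to see which attribute is missing. The query below is a minimal sketch, assuming the table is dbo.Items as in the examples in this chapter; it lists each UNIQUEIDENTIFIER column together with its ROWGUIDCOL and nullability flags from sys.columns.

```sql
-- Sketch: show the attributes of every UNIQUEIDENTIFIER column in dbo.Items.
-- The table name is taken from the chapter's examples; adjust it as needed.
SELECT  c.name AS ColumnName ,
        c.is_rowguidcol ,   -- 1 = marked as ROWGUIDCOL
        c.is_nullable       -- 0 = NOT NULL
FROM    sys.columns AS c
WHERE   c.object_id = OBJECT_ID('dbo.Items')
        AND TYPE_NAME(c.system_type_id) = 'uniqueidentifier';
```

A column that shows is_rowguidcol = 1 and is_nullable = 0 still needs a UNIQUE or primary key constraint, which this query does not check.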

If the table already has a UNIQUEIDENTIFIER column with all the required characteristics, you can simply go ahead and add the FILESTREAM column using the ALTER TABLE script in Listing 2-9. However, if the table does not have a UNIQUEIDENTIFIER column then you need to add one, with all the required attributes. Listing 2-11 shows how to do this.

    ALTER TABLE [dbo].[Items]
    ADD ItemID UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE
    GO

Listing 2-11: Adding a UNIQUEIDENTIFIER column with all the required attributes.

A third possible outcome is that the table already has a UNIQUEIDENTIFIER column, but it does not have all the required attributes. For example, the column may allow NULLs, or it may not be marked as a ROWGUIDCOL, and so on. Depending upon your design requirements, you could either add a new column with the required attributes, as shown in Listing 2-11, or you could change the existing column, as shown in Listing 2-12; note, though, that a table can have only one ROWGUIDCOL.

    -- Create a table with a UNIQUEIDENTIFIER Column
    CREATE TABLE [dbo].[Items]
    (
        [ItemID] UNIQUEIDENTIFIER ,
        [ItemNumber] VARCHAR(20) ,
        [ItemDescription] VARCHAR(50)
    )

    -- Alter the UNIQUEIDENTIFIER column to NOT NULL
    ALTER TABLE Items
    ALTER COLUMN ItemID UNIQUEIDENTIFIER NOT NULL

    -- Add a UNIQUE Constraint
    ALTER TABLE Items
    ADD CONSTRAINT IX_ItemID UNIQUE(ItemID)

    -- Mark the column (now NOT NULL and UNIQUE) as the ROWGUIDCOL
    ALTER TABLE Items
    ALTER COLUMN ItemID ADD ROWGUIDCOL

    -- Add FILESTREAM column
    ALTER TABLE [dbo].[Items]
    ADD ItemImage VARBINARY(MAX) FILESTREAM NULL

Listing 2-12: Configuring a UNIQUEIDENTIFIER column to make the table FILESTREAM ready.

Converting VARBINARY(MAX) columns to FILESTREAM and vice versa

A common requirement, when a database is migrated from an older version of SQL Server to SQL Server 2008 or later, is to convert the existing VARBINARY(MAX) columns to FILESTREAM. It is also quite possible that you would want to convert a FILESTREAM column to VARBINARY(MAX). For example, if it turns out that most of the values being stored in the FILESTREAM column are small (less than 1 MB) then it may be beneficial to convert the column to VARBINARY(MAX), since FILESTREAM columns only perform better for larger data. However, this is a decision that can be taken only after performing adequate performance tests.

Unfortunately, changing the FILESTREAM attribute is not just a simple metadata operation; SQL Server does not permit you to change the FILESTREAM attribute of an existing column (the ALTER COLUMN command cannot SET or CLEAR the FILESTREAM flag on a column).

When you change a VARBINARY(MAX) column to FILESTREAM, the data in the column needs to be moved to the FILESTREAM data container on the NTFS file system. Similarly, when the FILESTREAM attribute is removed, the FILESTREAM data has to be moved from the file system into the data files of the database. That explains why SQL Server does not allow you to modify the FILESTREAM flag of a VARBINARY(MAX) column.

As such, the following steps are required if you want to convert an existing VARBINARY(MAX) column to FILESTREAM.

1. Create a new VARBINARY(MAX) column with the FILESTREAM attribute.
2. Copy the data from the VARBINARY(MAX) column to the FILESTREAM column using an UPDATE statement. This will create a FILESTREAM file for each non-NULL value being copied to the new column.
3. Drop the original VARBINARY(MAX) column.
4. Rename the new FILESTREAM column to the original name.

The T-SQL script in Listing 2-13 demonstrates how to do this. Each step runs in its own batch, because a statement cannot reference a column that was added earlier in the same batch.

    -- Add FILESTREAM column
    ALTER TABLE [dbo].[Items]
    ADD ItemImage_New VARBINARY(MAX) FILESTREAM NULL
    GO

    -- Copy data to the new column
    UPDATE [dbo].[Items] SET
        ItemImage_New = ItemImage
    GO

    -- Drop the original column
    ALTER TABLE [dbo].[Items]
    DROP COLUMN ItemImage
    GO

    -- Rename the new column to original name
    EXECUTE sp_rename
        @objname = '[dbo].[Items].[ItemImage_New]',
        @newname = 'ItemImage',
        @objtype = 'COLUMN'
    GO

Listing 2-13: Altering a VARBINARY(MAX) column to a FILESTREAM column.

A similar approach can be used to change a FILESTREAM column to a regular VARBINARY(MAX) column.
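The reverse conversion can be sketched along the same lines. The script below is an illustration only, assuming the same dbo.Items table and ItemImage column used in this chapter; the temporary column name ItemImage_New is simply reused from Listing 2-13. The only real difference is that the new column is declared without the FILESTREAM attribute, so the UPDATE copies the BLOB data from the file system into the data files of the database.

```sql
-- Sketch: convert a FILESTREAM column back to a regular VARBINARY(MAX) column.
-- Add a regular VARBINARY(MAX) column (no FILESTREAM attribute)
ALTER TABLE [dbo].[Items]
ADD ItemImage_New VARBINARY(MAX) NULL
GO

-- Copy the BLOB data from the FILESTREAM column into the new column
UPDATE [dbo].[Items] SET
    ItemImage_New = ItemImage
GO

-- Drop the original FILESTREAM column
ALTER TABLE [dbo].[Items]
DROP COLUMN ItemImage
GO

-- Rename the new column to the original name
EXECUTE sp_rename
    @objname = '[dbo].[Items].[ItemImage_New]',
    @newname = 'ItemImage',
    @objtype = 'COLUMN'
GO
```

On a large table, the UPDATE can generate a lot of transaction log activity, so it may be worth testing the copy step on a representative data set first.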

Note that, after deleting the FILESTREAM column, the table will still be associated with the FILESTREAM filegroup. Even if you delete all the FILESTREAM columns and try to drop the FILESTREAM file or filegroup, you will get an error stating that the file or filegroup is not empty. After dropping all the FILESTREAM columns, you have to set FILESTREAM_ON to NULL, using ALTER TABLE … SET (FILESTREAM_ON = "NULL").

If you have created FILESTREAM columns on a table and later on decide to move back to VARBINARY(MAX), you might also want to drop the GUID column you created on the table. This is relevant in cases where the GUID column was added only to fulfill the FILESTREAM requirement.

Tables and FILESTREAM filegroups

Each table that contains FILESTREAM columns must be associated with at least one FILESTREAM filegroup. A table can be associated with more than one FILESTREAM filegroup if it is partitioned.

As discussed earlier, depending upon your design requirements, all tables can be associated with a single FILESTREAM filegroup, or they can each be associated with a separate filegroup (or some combination). This decision may be made based on the FILESTREAM load that you expect to have in each table; using separate filegroups can distribute the FILESTREAM I/O load if they are located on different devices.

You can specify the FILESTREAM filegroup in which you would like to store the FILESTREAM data when you create a table. If you do not specify the FILESTREAM filegroup, SQL Server will associate the default FILESTREAM filegroup with that table and all the FILESTREAM data for that table will be stored there. When a table is associated with a FILESTREAM filegroup, all the FILESTREAM data from all FILESTREAM columns that exist in that table will go to the same filegroup. It is not possible to configure two FILESTREAM columns in the same table to store their data in two separate FILESTREAM filegroups.

Listing 2-7 demonstrated how to create a FILESTREAM-enabled table called Items that uses the default FILESTREAM filegroup. If you do not want to use the default FILESTREAM filegroup, you use the FILESTREAM_ON clause of the CREATE TABLE statement to specify the filegroup, as shown in Listing 2-14.

    CREATE TABLE [dbo].[Items]
    (
        [ItemID] UNIQUEIDENTIFIER ROWGUIDCOL
                 NOT NULL
                 UNIQUE ,
        [ItemNumber] VARCHAR(20) ,
        [ItemDescription] VARCHAR(50) ,
        [ItemImage] VARBINARY(MAX) FILESTREAM
                    NULL
    ) FILESTREAM_ON [NorthPole_fs]

Listing 2-14: Creating a table with FILESTREAM columns using a specified filegroup.

For an example that shows how to change the FILESTREAM filegroup associated with an existing table, see the section Creating a database with multiple FILESTREAM filegroups, later in this chapter.

FILESTREAM filegroup queries

There are some system views that you can use to query FILESTREAM-related information on your SQL Server 2008 database. We will see many of them in the coming chapters, but a couple of examples are included here.

You might need to find out the FILESTREAM filegroup that is associated with a specific table. You can do this by querying the catalog view sys.data_spaces as shown in Listing 2-15.

    SELECT  d.name AS [FILESTREAM Filegroup]
    FROM    sys.tables t
            INNER JOIN sys.data_spaces d
                ON t.filestream_data_space_id = d.data_space_id
    WHERE   t.name = 'Items'
    /*
    FILESTREAM Filegroup
    ---------------------------------------------------
    NorthPole_fs
    */

Listing 2-15: Retrieving the FILESTREAM filegroup associated with a table.

Listing 2-16 queries the catalog view sys.filegroups and retrieves the default FILESTREAM filegroup of the current database.

    SELECT  name
    FROM    sys.filegroups
    WHERE   type_desc = 'FILESTREAM_DATA_FILEGROUP'
            AND is_default = 1
    /*
    name
    -----------------------------------------------
    NorthPole_fs
    */

Listing 2-16: Retrieving the default FILESTREAM filegroup of the current database.

FILESTREAM Data Manipulation Using T-SQL

We have so far spent quite some time discussing FILESTREAM-enabled databases, FILESTREAM filegroups, and tables with FILESTREAM columns. Enough DDL for now! It is time to see some DML operations.

A T-SQL batch or stored procedure can access FILESTREAM columns as if they are regular VARBINARY(MAX) columns. However, accessing large BLOB data stored in a FILESTREAM column from T-SQL does not perform well, for the reasons below.

• One of the features that FILESTREAM offers is the I/O streaming capability for faster FILESTREAM operations. When accessing FILESTREAM data through T-SQL, the benefits of streaming I/O are not available.
• When accessing FILESTREAM data using T-SQL, SQL Server memory is used for buffering the FILESTREAM data. This will add memory overhead to the SQL Server instance.
• To T-SQL, a FILESTREAM column appears as a VARBINARY(MAX) column with a 2 GB size limitation. Therefore, if the data stored in the FILESTREAM column is larger than 2 GB, T-SQL can only access the first 2 GB of it.

Therefore, for best performance and optimum usage of the SQL Server resources, FILESTREAM data should not generally be accessed using T-SQL. Instead, we should use the streaming APIs exposed by SQL Server, as described in Chapter 3: Accessing FILESTREAM Data from Client Applications.

However, as always, there are exceptions to the rule. Most significantly, unless the FILESTREAM column is assigned a default value, a client application has to execute a T-SQL INSERT statement in order to add a new row into a FILESTREAM-enabled table, rather than use the streaming APIs. The reason for this is that streaming access can be made only to rows that contain non-NULL BLOB data in the FILESTREAM column. So, T-SQL INSERT statements on FILESTREAM-enabled tables usually store dummy VARBINARY data in the FILESTREAM column, or a default value can be set on the FILESTREAM column. For example, we could insert a dummy value, or use a default value, of 0x. This will create a zero-length file, which can then be replaced by actual FILESTREAM BLOB data using streaming I/O (or T-SQL).
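The two placeholder techniques just described could look something like the sketch below. It assumes the dbo.Items table used throughout this chapter; the item numbers, descriptions and the constraint name DF_Items_ItemImage are made up for illustration.

```sql
-- Option 1: insert a zero-length placeholder (0x) explicitly
INSERT INTO dbo.Items ( ItemID, ItemNumber, ItemDescription, ItemImage )
VALUES ( NEWID(), 'MS1002', 'Placeholder item', 0x )
GO

-- Option 2: let a DEFAULT constraint supply the placeholder
-- (DF_Items_ItemImage is a hypothetical constraint name)
ALTER TABLE dbo.Items
ADD CONSTRAINT DF_Items_ItemImage DEFAULT ( 0x ) FOR ItemImage
GO

INSERT INTO dbo.Items ( ItemID, ItemNumber, ItemDescription )
VALUES ( NEWID(), 'MS1003', 'Item using the default' )
GO

-- Both rows now hold a non-NULL, zero-length value
SELECT  ItemNumber ,
        DATALENGTH(ItemImage) AS ImageBytes
FROM    dbo.Items
WHERE   ItemNumber IN ( 'MS1002', 'MS1003' )
```

Either way, the row ends up with a non-NULL FILESTREAM value, which is what the streaming APIs need before they can open the file.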

Beyond this, there are a few cases where it is acceptable to access and manipulate data using T-SQL. In most cases, it is fine to store small BLOB data values in a FILESTREAM column and retrieve them using T-SQL. The question you might ask now is, "What size is considered to be small?" There is no absolute definition but, generally speaking, anything below 1 MB may be considered to be small; please refer to the Microsoft Research Paper available at http://brurl.com/fs2 for a more in-depth discussion that demonstrates how the BLOB size affects performance when stored inside the database as well as in the file system. If the BLOB data being stored, updated, or retrieved from the FILESTREAM storage is large (generally speaking, bigger than 1 MB) it is not advisable to manipulate the data using T-SQL.

The following operations require T-SQL because there are no equivalent operations in the streaming APIs:

• deleting rows from a table with FILESTREAM columns
• truncating a FILESTREAM-enabled table
• updating one or more FILESTREAM columns to NULL.

Inserting a row with FILESTREAM data

We're going to walk through an example of creating a new FILESTREAM column and then inserting an image into that column using T-SQL. We'll then explore the contents of the FILESTREAM data container, in order to gain some insight into how this data is stored.

Please note, right from the start, that this example is for illustrative and learning purposes only. Firstly, the most benefit is gained from the FILESTREAM feature when accessing it using the streaming APIs instead of T-SQL, as we'll discuss in Chapter 3. Secondly, you would never, in a production system, delve directly into the contents of the FILESTREAM data container, as we will do here.

With this in mind, let's get started! If you have followed all the sample scripts so far and executed them in SSMS, you will have a FILESTREAM-enabled table containing a FILESTREAM column. So, before we move ahead with storing BLOB data in the FILESTREAM column of our table, let us rebuild it to make sure that everything is set up correctly (Listing 2-17).

    -- Drop the table if exists
    IF OBJECT_ID('Items', 'U') IS NOT NULL
        BEGIN
            DROP TABLE Items
        END
    GO

    -- Create the table
    CREATE TABLE [dbo].[Items]
    (
        [ItemID] UNIQUEIDENTIFIER ROWGUIDCOL
                 NOT NULL
                 UNIQUE ,
        [ItemNumber] VARCHAR(20) ,
        [ItemDescription] VARCHAR(50) ,
        [ItemImage] VARBINARY(MAX) FILESTREAM
                    NULL
    )
    GO

Listing 2-17: Creating a table with a FILESTREAM column.

SQL Server 2005 enhanced the OPENROWSET() function with a very powerful attribute: BULK. The BULK attribute can be used to load the content of an image file on disk as a VARBINARY(MAX) value, which can then be inserted into a FILESTREAM column.

We'll use the BULK attribute to load the image of a Microsoft mouse (Figure 2-4) into the FILESTREAM column of the Items table. (You can find this image in the sample image download file available at http://brurl.com/fs4; extract it into the C:\TEMP folder if you do not want to change the code given in the sample snippet.)

Figure 2-4: Image of Microsoft mouse to be loaded into a FILESTREAM column.

Note again that, in most real-world scenarios, we would not insert this image directly using T-SQL. Instead, we'd insert a record with a dummy (generally 0x) value into the FILESTREAM column, and then update the column with the required image using the streaming APIs.

Listing 2-18 shows the T-SQL script to load the image.

    -- Declare a variable to store the image data
    DECLARE @img AS VARBINARY(MAX)

    -- Load the image data
    SELECT  @img = CAST(bulkcolumn AS VARBINARY(MAX))
    FROM    OPENROWSET(BULK 'C:\temp\MicrosoftMouse.jpg', SINGLE_BLOB) AS x

    -- Insert the data to the table
    INSERT  INTO Items
            ( ItemID ,
              ItemNumber ,
              ItemDescription ,
              ItemImage
            )
            SELECT  NEWID() ,
                    'MS1001' ,
                    'Microsoft Mouse' ,
                    @img

Listing 2-18: Loading the content of a disk file into a VARBINARY(MAX) variable and inserting it into a FILESTREAM column.

Loading the content to the VARBINARY(MAX) variable is unnecessary in the above example. This additional step was added to make the code easier to follow. Listing 2-19 shows another version of the code that does not use a staging variable, but instead inserts the data directly into the table.

    -- Insert the data to the table
    INSERT  INTO Items
            ( ItemID ,
              ItemNumber ,
              ItemDescription ,
              ItemImage
            )
            SELECT  NEWID() ,
                    'MS1001' ,
                    'Microsoft Mouse' ,
                    CAST(bulkcolumn AS VARBINARY(MAX))
            FROM    OPENROWSET(BULK 'C:\temp\MicrosoftMouse.jpg', SINGLE_BLOB) AS x

Listing 2-19: Loading a disk file directly into a FILESTREAM column.

So, where did the data go? Well, the data we just loaded into the FILESTREAM column goes to the FILESTREAM data container. When SQL Server stores BLOB data in the FILESTREAM data container, an exact copy of the original binary chunk is stored in the file system. SQL Server does not change the content; nothing is added, modified, or removed. Also, there is a specific location for every FILESTREAM column in the database. Listing 2-20 shows the file system map.

    C:\
    |-ArtOfFS\
      |-Demos\
        |-Chapter2\
          |-NorthPole_fs\             -- FILESTREAM Data Container
            |-$FSLOG\                 -- Log Folder
            |-filestream.hdr          -- Metadata File
            |-5a032731…dfeff9\        -- Root folder for "Items" table
              |-e943c1e6…5fc1f5\      -- Folder for "ItemImage" column

Listing 2-20: Sample FILESTREAM data container structure.

So, in this example, the image that was loaded into the FILESTREAM column of the Items table is stored in the file system folder e943c1e6…5fc1f5. Note that the name of the folder may be different if you are following this example on your computer, but the hierarchy will be the same.

Let's get adventurous and explore this folder. But before that, a statutory warning!

In a production system, never touch the FILESTREAM files directly!
Never update the FILESTREAM data directly. I am showing this here only for demonstration purposes, and it should never be done on a production system. If you do, you may corrupt your database. The FILESTREAM data files should be accessed only using the Win32 interface exposed by SQL Server, as explained in Chapter 3: Accessing FILESTREAM Data from Client Applications.

Bearing in mind the risks of playing with the FILESTREAM data container, let's go ahead and do it anyway, as it will shed some light on how the files are stored in the file system. If we navigate to the file system location for the ItemImage column, we will see something very similar to Figure 2-5.

Figure 2-5: The contents of a FILESTREAM data container.

A new file has been created inside the subfolder that SQL Server created for the ItemImage column. The content of the file will be the same as the content of the original file. To make sure that we don't corrupt the database, we'll copy this file to another folder and open it in MS Paint to view its contents. The file opens correctly (Figure 2-6). This indicates that SQL Server stores the content of the file "as is" in the FILESTREAM data file.

Figure 2-6: Opening the FILESTREAM data in MS Paint.
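A less intrusive way to sanity-check the stored value, without going near the data container, is to compare its length with the length of the source file from T-SQL. The query below is only a sketch; it assumes the row with ItemNumber 'MS1001' inserted earlier and the same C:\temp\MicrosoftMouse.jpg path used in the listings. Matching lengths do not prove that the content is identical, but a mismatch would indicate a problem.

```sql
-- Sketch: compare the size of the stored BLOB with the size of the source file
SELECT  i.ItemNumber ,
        DATALENGTH(i.ItemImage) AS StoredBytes ,
        DATALENGTH(x.BulkColumn) AS SourceBytes
FROM    dbo.Items AS i
        CROSS JOIN OPENROWSET(BULK 'C:\temp\MicrosoftMouse.jpg', SINGLE_BLOB) AS x
WHERE   i.ItemNumber = 'MS1001'
```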

Updating and deleting FILESTREAM data

Only rarely will you modify FILESTREAM data using T-SQL, since it is much more efficient to do it using the streaming APIs, as demonstrated in Chapter 3. Note that SQL Server does not support a partial update to the FILESTREAM data. If you want to make changes to your BLOB data, the entire content must be replaced.

However, it is more common to delete FILESTREAM data from a column by setting it to NULL, using T-SQL, as shown in Listing 2-21.

    UPDATE  Items SET
            ItemImage = NULL

Listing 2-21: Updating a FILESTREAM column to NULL to delete the data.

The BLOB file is not removed from the FILESTREAM data container immediately. This is done by a garbage collector background thread, so there is a delay before the physical file is deleted from the file system. You can force a garbage collection using the CHECKPOINT command but that does not guarantee that the garbage collector will remove the unwanted file. The garbage collector will remove the unwanted FILESTREAM data files only after the transaction log is truncated past the log entry that created the garbage files.

If you intend to delete the entire row containing the FILESTREAM data from the table, a regular DELETE command will delete the row from the table and remove all the FILESTREAM data associated with that record from the FILESTREAM data container. Again, there may be a delay before the physical data is deleted from the data container by the garbage collector.
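As a rough sketch of the delete-and-clean-up sequence described above, the script below deletes one row and then issues a CHECKPOINT. It assumes the NorthPole database and the 'MS1001' row from the earlier listings. The final statement calls sp_filestream_force_garbage_collection, a system procedure that, to the best of my knowledge, is available only in SQL Server 2012 and later; leave it out on SQL Server 2008.

```sql
-- Delete the row; its FILESTREAM file becomes garbage
DELETE  FROM dbo.Items
WHERE   ItemNumber = 'MS1001'
GO

-- Ask for a checkpoint; this does not guarantee that the file is removed
CHECKPOINT
GO

-- SQL Server 2012 and later only: explicitly run the FILESTREAM garbage collector
EXEC sp_filestream_force_garbage_collection @dbname = N'NorthPole'
GO
```

If the database uses the FULL recovery model, the file will typically remain on disk until the log has also been backed up, because the log cannot be truncated before then.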

FILESTREAM and triggers

Just like other columns, FILESTREAM columns are accessible from within DML triggers. The FILESTREAM columns can be accessed from a trigger in the same way as a regular VARBINARY(MAX) column. DML triggers get fired even when modifying the FILESTREAM data using Win32 I/O streaming.

It is not good practice to print debug information from a trigger. However, for the purpose of showing that the FILESTREAM columns are accessible from a trigger, we'll create an INSERT trigger that prints the size of the FILESTREAM data value being inserted (Listing 2-22).

    CREATE TRIGGER TGR_Items ON dbo.Items
        AFTER INSERT
    AS
        BEGIN
            DECLARE @size INT
            SELECT  @size = DATALENGTH(ItemImage)
            FROM    INSERTED
            PRINT 'FILESTREAM Size :' + STR(@size)
        END

Listing 2-22: An INSERT trigger that prints the size of the FILESTREAM data value being inserted.

Now we'll insert a new record into the Items table, including FILESTREAM data, using the script in Listing 2-23.

    -- Insert the data to the table
    INSERT  INTO Items
            ( ItemID ,
              ItemNumber ,
              ItemDescription ,
              ItemImage
            )
            SELECT  NEWID() ,
                    'MS1001' ,
                    'Microsoft Mouse' ,
                    CAST(bulkcolumn AS VARBINARY(MAX))
            FROM    OPENROWSET(BULK 'C:\temp\MicrosoftMouse.jpg', SINGLE_BLOB) AS x
    /*
    OUTPUT From Trigger:
    FILESTREAM Size : 2288
    */

Listing 2-23: Inserting a row into the Items table with FILESTREAM data.

The code within the trigger prints the size of the FILESTREAM data value when the new value is inserted.

Advanced FILESTREAM DDL

In this section, we'll briefly review some slightly more advanced requirements for FILESTREAM-enabled databases and tables, namely using FILESTREAM with partitioned tables, and creating databases with multiple FILESTREAM filegroups, for scalability.

The FILESTREAM data container and partitioned tables

We saw earlier that SQL Server creates a folder at the root of the FILESTREAM data container for every FILESTREAM-enabled table in the FILESTREAM database. If a table is partitioned, a folder will be created for every partition in the table. Within each of these folders, a subfolder will be created for every FILESTREAM column in the table.

Let's create a partitioned table and see this in action. First, we'll drop the table we created earlier by running the script shown in Listing 2-24.

    IF OBJECT_ID('Items', 'U') IS NOT NULL
        BEGIN
            DROP TABLE Items
        END
    GO

Listing 2-24: Dropping the Items table.

When you drop a table with FILESTREAM columns, the folder for that table is removed from the FILESTREAM data container. Note, however, that this is done by a garbage collector and you might experience some delay before the folder and its files are removed.

If we now go to the FILESTREAM data container and take a look at its content, we see that the FILESTREAM folders have been removed and the container is empty except for the metadata objects (Figure 2-7).

Figure 2-7: Verifying that the FILESTREAM data container contains only metadata.

Now we'll create a partitioned table. The script in Listing 2-25 creates a new version of the Items table with three partitions. (Remember that the aim of this script is to create a FILESTREAM table with three partitions for demonstration purposes; the way this table is partitioned does not make much business sense.)

    -- Create partition function
    CREATE PARTITION FUNCTION ItemPartitionFn (INT)
    AS RANGE RIGHT FOR VALUES (1, 2) ;

    -- Create partition scheme for ItemID
    CREATE PARTITION SCHEME ItemPartitionSch
    AS PARTITION ItemPartitionFn ALL TO ([PRIMARY]) ;

    -- Create partition scheme for FILESTREAM
    CREATE PARTITION SCHEME ItemFSPartitionSch
    AS PARTITION ItemPartitionFn ALL TO (NorthPole_fs) ;

    -- Create Item table with FILESTREAM column
    CREATE TABLE Items
    (
        ItemGUID UNIQUEIDENTIFIER ROWGUIDCOL
                 NOT NULL
                 UNIQUE ON [PRIMARY] ,
        ItemID INT ,
        ItemName VARCHAR(50) ,
        ItemImage VARBINARY(MAX) FILESTREAM
    )
    ON ItemPartitionSch(ItemID)
    FILESTREAM_ON ItemFSPartitionSch ;
    GO

Listing 2-25: Creating a FILESTREAM table with PARTITION.

After running the script, if we return to the FILESTREAM data container, there are three new folders; one for each partition in the Items table (Figure 2-8).
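To see how rows map to the three partitions, something like the following sketch can be run against the table just created. The three sample rows are invented for illustration; with RANGE RIGHT and the boundary values (1, 2), an ItemID below 1 falls into partition 1, an ItemID of 1 into partition 2, and an ItemID of 2 or more into partition 3.

```sql
-- Insert one illustrative row per partition (sample values are made up)
INSERT INTO Items ( ItemGUID, ItemID, ItemName, ItemImage )
VALUES ( NEWID(), 0, 'Item for partition 1', 0x ),
       ( NEWID(), 1, 'Item for partition 2', 0x ),
       ( NEWID(), 2, 'Item for partition 3', 0x )
GO

-- Ask the partition function which partition each row belongs to
SELECT  $PARTITION.ItemPartitionFn(ItemID) AS PartitionNumber ,
        COUNT(*) AS RowsInPartition
FROM    Items
GROUP BY $PARTITION.ItemPartitionFn(ItemID)
ORDER BY PartitionNumber
```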

Figure 2-8: Data container showing a folder for each partition in the Items table.

Creating a database with multiple FILESTREAM filegroups

Depending upon your database design requirements, you might decide to have more than one FILESTREAM filegroup. If you have a few tables for which you expect a very high volume of FILESTREAM reads and writes, you might decide to have FILESTREAM filegroups on different disk volumes, and set up each table to use a different disk for its FILESTREAM storage. This will give you more throughput as each disk can be used in parallel.

The script in Listing 2-26 creates a database with two FILESTREAM filegroups.

    CREATE DATABASE NorthPole
    ON PRIMARY (
        NAME = NorthPole,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter2\NorthPole.mdf'
    ),
    FILEGROUP NorthPole_fs1 CONTAINS FILESTREAM DEFAULT (
        NAME = NorthPole_fs1,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter2\NorthPole_fs1'
    ),
    FILEGROUP NorthPole_fs2 CONTAINS FILESTREAM (
        NAME = NorthPole_fs2,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter2\NorthPole_fs2'
    )
    LOG ON (
        NAME = NorthPole_log,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter2\NorthPole_log.ldf'
    )
    GO

Listing 2-26: Creating a database with two FILESTREAM filegroups.

Note that, if you have more than one FILESTREAM filegroup, you need to specify one of them as the default filegroup; SQL Server will use the default filegroup for the FILESTREAM storage if you create a table without specifying the FILESTREAM filegroup. In Listing 2-26, NorthPole_fs1 is specified as the default.

To specify the FILESTREAM filegroup for the table you are creating, use the FILESTREAM_ON clause, as shown in Listing 2-27, in which two tables are created, each using a different filegroup.

    -- Store FILESTREAM data on "NorthPole_fs1"
    CREATE TABLE [dbo].[Items1]
    (
        [ItemID] UNIQUEIDENTIFIER ROWGUIDCOL
                 NOT NULL
                 UNIQUE ,
        [ItemNumber] VARCHAR(20) ,
        [ItemDescription] VARCHAR(50) ,
        [ItemImage] VARBINARY(MAX) FILESTREAM
                    NULL
    ) FILESTREAM_ON NorthPole_fs1
    GO

    -- Store FILESTREAM data on "NorthPole_fs2"
    CREATE TABLE [dbo].[Items2]
    (
        [ItemID] UNIQUEIDENTIFIER ROWGUIDCOL
                 NOT NULL
                 UNIQUE ,
        [ItemNumber] VARCHAR(20) ,
        [ItemDescription] VARCHAR(50) ,
        [ItemImage] VARBINARY(MAX) FILESTREAM
                    NULL
    ) FILESTREAM_ON NorthPole_fs2
    GO

Listing 2-27: Creating two tables that store the FILESTREAM data in different filegroups.

You can change the FILESTREAM filegroup of a table even after the table is created. However, this is possible only if the table does not contain any FILESTREAM columns. Therefore, you must remove any FILESTREAM columns before you reset the FILESTREAM_ON flag, as shown in Listing 2-28.

    -- Store FILESTREAM data on "NorthPole_fs1"
    CREATE TABLE [dbo].[Items]
    (
        [ItemID] UNIQUEIDENTIFIER ROWGUIDCOL
                 NOT NULL
                 UNIQUE ,
        [ItemNumber] VARCHAR(20) ,
        [ItemDescription] VARCHAR(50) ,
        [ItemImage] VARBINARY(MAX) FILESTREAM
                    NULL
    ) FILESTREAM_ON NorthPole_fs1
    GO

    -- Drop the FILESTREAM column
    ALTER TABLE Items DROP COLUMN ItemImage

    -- Reset the FILESTREAM_ON flag
    ALTER TABLE Items SET (FILESTREAM_ON = "NULL")

    -- Change the FILESTREAM filegroup
    ALTER TABLE Items SET (FILESTREAM_ON = NorthPole_fs2)

    -- Add the FILESTREAM column back
    ALTER TABLE Items
    ADD [ItemImage] VARBINARY(MAX) FILESTREAM NULL

Listing 2-28: Changing the FILESTREAM filegroup of a table.

Disabling FILESTREAM Storage on a Database

Disabling FILESTREAM storage on a database involves removing the FILESTREAM files and filegroups from the database. This can be done by using an ALTER DATABASE statement.

If the database contains any tables that use FILESTREAM storage, all references to the FILESTREAM filegroups must be removed before the FILESTREAM filegroups are removed. The procedure is summarized below.

1. Drop all FILESTREAM columns from all tables using ALTER TABLE.
2. Set the FILESTREAM_ON attribute on all the tables to NULL using ALTER TABLE.
3. Remove the FILESTREAM file using ALTER DATABASE.
4. Remove the FILESTREAM filegroup using ALTER DATABASE.

Assume that the NorthPole database has a table with a FILESTREAM column, created using Listing 2-7. SQL Server will not allow the FILESTREAM filegroup to be removed until all the other steps have been completed.

First, you have to drop the FILESTREAM column from the table (Listing 2-29).

    -- Drop the FILESTREAM column
    ALTER TABLE Items DROP COLUMN ItemImage

Listing 2-29: Dropping the FILESTREAM column from the Items table.

Note that the association between the FILESTREAM filegroup and the table remains, even after dropping the FILESTREAM column; SQL Server still thinks that the FILESTREAM filegroup is not empty. The table is still linked to its FILESTREAM filegroup even though there are no FILESTREAM columns in the table.

Next, you must disassociate the table from its FILESTREAM filegroup by setting the FILESTREAM_ON attribute to NULL. Listing 2-30 shows how to do this.

    -- Disassociate table "Items" from its FILESTREAM filegroup
    ALTER TABLE Items SET (FILESTREAM_ON = "NULL")

Listing 2-30: Disassociating a table from its FILESTREAM filegroup.

Now you can remove the FILESTREAM file and filegroup using ALTER DATABASE. Listing 2-31 shows how to remove the FILESTREAM capabilities from the example database.

    -- Remove the File
    ALTER DATABASE NorthPole REMOVE FILE NorthPoleFS ;

    -- Remove the Filegroup
    ALTER DATABASE NorthPole REMOVE FILEGROUP NorthPoleFS ;

Listing 2-31: Removing FILESTREAM features from a database.
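One way to confirm that the removal worked is to check the catalog views afterwards. The sketch below assumes it is run in the NorthPole database; both queries should return no rows once the last FILESTREAM file and filegroup have been removed.

```sql
-- Any remaining FILESTREAM data containers (files) in the current database
SELECT  name ,
        physical_name
FROM    sys.database_files
WHERE   type_desc = 'FILESTREAM'

-- Any remaining FILESTREAM filegroups in the current database
SELECT  name
FROM    sys.filegroups
WHERE   type_desc = 'FILESTREAM_DATA_FILEGROUP'
```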

If a database does not contain any tables that use FILESTREAM storage, removing the FILESTREAM feature from a database is much simpler, and comprises only two steps:

1. Remove the FILESTREAM data container.
2. Remove the FILESTREAM filegroup.

Summary

In this chapter, we started with an exploration of the world of FILESTREAM data containers and filegroups, and a look at how to create FILESTREAM-enabled databases. The FILESTREAM data container is the file system location in which SQL Server stores the BLOB data in a FILESTREAM-enabled database. SQL Server stores FILESTREAM data in special filegroups known as FILESTREAM filegroups.

We saw that a database can have more than one FILESTREAM filegroup, but that one of the filegroups should be marked as the default filegroup. When you create a table with FILESTREAM columns you can specify the filegroup in which to store the FILESTREAM data for that table. If the table has multiple FILESTREAM columns, all of them will be stored in the same filegroup, which will be the default filegroup, unless you specify otherwise.

We also looked at how to insert data into our FILESTREAM columns. Unless a DEFAULT value is assigned to a FILESTREAM column, when creating the table, the data must be initially inserted using T-SQL. However, inserting or updating FILESTREAM data using T-SQL may not be beneficial in terms of performance, since SQL Server's memory buffers are used for processing the BLOB data, which might create memory pressure on production systems. As such, it is better to set an initial default value on the FILESTREAM column, and then modify the data to the required values using the Win32 streaming APIs exposed by SQL Server, which we'll discuss in the next chapter.

Finally, we looked at how to disable FILESTREAM storage, noting in particular that setting the FILESTREAM column to NULL will delete the associated FILESTREAM data from the FILESTREAM data container.

Chapter 3: Accessing FILESTREAM Data from Client Applications

As discussed in the previous chapters, FILESTREAM storage is ideal for BLOB data values that are relatively large (larger than 1 MB). This data can be accessed either through T-SQL code or through the FILESTREAM APIs exposed by SQL Server, though the latter is more efficient than the former.

The FILESTREAM APIs come in two flavors, Win32 and .NET, and in this chapter we'll walk through several examples using each of these I/O streaming APIs.

What is Streaming?

Streaming, in simple terms, means "continuous flow." Within the context of reading and writing BLOB data, it refers to a continuous flow of bytes to and from the physical disk, without local buffering (i.e. without collecting bytes in an intermediate location before transmitting the data to the target). Performing FILESTREAM operations with streaming offers two immediate benefits.

Firstly, it means that only a small memory buffer is needed. Without streaming, potentially large files will be stored in the memory buffer. This could overload the server by incurring excessive paging and loss of cached execution plans. It could also affect other processes running on the server. It is also possible that the file will be bigger than the available memory on the server.

Secondly, streaming gives better performance in several cases. Your process can start sending bytes to the target location even before the entire contents of the file are read from the source location. This performance benefit may be particularly evident if you are serving the BLOB data to a client over the Internet (such as streaming an audio file or downloading a file from a client browser).
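To make the idea of streaming concrete, here is a minimal C++ sketch that processes a source in fixed-size chunks and hands each chunk to a consumer as soon as it has been read. It uses only standard library streams; the function name streamInChunks and the callback signature are our own illustrative choices and are not part of any FILESTREAM API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads `source` in chunks of at most `bufferSize` bytes and passes each chunk
// to `consume` as soon as it is available. Returns the total bytes processed.
// Only one buffer of `bufferSize` bytes is held in memory at any time.
// (Hypothetical helper for illustration; not a FILESTREAM API.)
std::size_t streamInChunks(
    std::istream& source, std::size_t bufferSize,
    const std::function<void(const char*, std::size_t)>& consume)
{
    std::vector<char> buffer(bufferSize);
    std::size_t total = 0;
    while (source) {
        source.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::size_t got = static_cast<std::size_t>(source.gcount());
        if (got == 0) break;          // nothing more to read
        consume(buffer.data(), got);  // the consumer can forward these bytes now
        total += got;
    }
    return total;
}
```

The memory held by the function depends on the buffer size only, not on the size of the source, and the consumer sees the first chunk before the rest of the source has been read.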

Understanding Streaming Access to FILESTREAM Data

In order to gain streaming access to the FILESTREAM data, SQL Server exposes a managed class for .NET developers and a Win32 API for Windows programmers. Having instantiated a SqlFileStream object representing the FILESTREAM data file (in the case of .NET), or opened a FILESTREAM data file (using the SQL Server Win32 API), you can read or write to the FILESTREAM data file using the file manipulation functions provided by the client library or the .NET programming language.

So how does this differ from the traditional model of storing the BLOB data in the file system and storing the path to the data file in a relational table? How does SQL Server maintain transactional consistency between the relational and the FILESTREAM data?

Well, when using the FILESTREAM streaming APIs, you are not accessing the FILESTREAM data directly; SQL Server opens the file handle and passes it to you through its own wrapper APIs. This allows SQL Server to keep total control over the operations and maintain transactional consistency between the relational data in the tables and the BLOB data in the FILESTREAM store.

Accessing FILESTREAM data through the streaming APIs involves the steps below, irrespective of the client library or programming language you use.

1. Start a transaction – You need a transaction to gain streaming access to the FILESTREAM data.
2. Retrieve the path name and transaction context – You need to obtain a logical path name to the FILESTREAM data file and a pointer to the current transaction context to access the FILESTREAM data.
3. Open the FILESTREAM data file – Gain access to the FILESTREAM data file either using the .NET wrapper class or the Win32 FILESTREAM API.

4. Read/Write FILESTREAM data – Read or write FILESTREAM data using the file I/O functions.
5. Close the FILESTREAM data file – Gracefully close the FILESTREAM data file.
6. Commit/rollback the transaction – Commit or roll back the transaction when you have finished.

Let's examine each of these steps in detail.

Step 1: Starting a transaction

All streaming access to FILESTREAM data, even just read access, must be done within the context of a database transaction so, before accessing the FILESTREAM data, your application should start a SQL Server transaction.

The exact manner in which you start a transaction on the current database connection will vary depending on the client library in use (ADO.NET, Open Database Connectivity, etc.). There is nothing FILESTREAM-specific in this step; you can begin a transaction in the same way as for a T-SQL DML operation.

We will see more detailed examples and fully functional sample code later in this chapter but, for now, Listing 3-1 shows a simple example of how to start a transaction on an ADO.NET connection object using C#.

    // Create a SqlConnection object
    SqlConnection con = new SqlConnection(
        "database=NorthPole;server=(local);integrated security=sspi;");
    // Open a connection to the database
    con.Open();
    // Start a transaction
    SqlTransaction transaction = con.BeginTransaction("ItemTran");

Listing 3-1: Using C# to start a transaction on an ADO.NET connection object.

Non-.NET programming languages like C++ can use the Win32 Open Database Connectivity (ODBC) APIs to create a connection to the desired database and begin a database transaction. Listing 3-2 shows an example.

    //Connect to SQL Server through ODBC using DSN "NorthPole"
    SQLConnect(hdbc, TEXT("NorthPole"), SQL_NTS, NULL, 0, NULL, 0);
    //Start a transaction by setting AUTO_COMMIT attribute to OFF
    SQLSetConnectAttr(hdbc, SQL_ATTR_AUTOCOMMIT,
        (SQLPOINTER)SQL_AUTOCOMMIT_OFF, SQL_IS_UINTEGER);

Listing 3-2: Using C++ to start a transaction on an ODBC connection.

When the AUTO_COMMIT attribute is turned OFF on an ODBC connection, the next T-SQL statement starts an explicit transaction. The transaction stays active until SQLEndTran() is called explicitly with SQL_COMMIT or SQL_ROLLBACK.

Step 2: Retrieving the logical path name and transaction context

Before accessing a FILESTREAM data value, we must retrieve, from SQL Server, the logical path to the disk file associated with the FILESTREAM data, and the transaction context that represents the current database transaction, and then include these values in the data access request.

FILESTREAM data values and cells
In this chapter, the terms FILESTREAM cell and FILESTREAM data value refer, respectively, to a specific cell in a FILESTREAM table (i.e. the intersection of a given row and column) and the BLOB data stored in a FILESTREAM cell.

PathName() is a new intrinsic function available on the FILESTREAM-enabled column of a SQL Server 2008 database. Note that, unlike other T-SQL functions, PathName() is case sensitive. It returns the logical path to the physical file associated with a FILESTREAM data value. The example in Listing 3-3 retrieves the PathName() of the ItemImage column from the Items table. The PathName() value returned by each FILESTREAM cell will be unique, pointing to the specific disk file associated with the data in the given FILESTREAM cell.

    SELECT
        ItemNumber,
        ItemImage.PathName() AS LogicalPath
    FROM Items
    /*
    Item#  LogicalPath
    ------ --------------------------------------------------
    MS1001 \\JACOB-LAPTOP\SQL2008R2NOV\v1\

needs to be sent to SQL Server, along with the request, in order to gain streaming access to a FILESTREAM data value. Listing 3-4 shows an example.

    -- Start an explicit transaction
    BEGIN TRAN
    -- returns the FILESTREAM TRANSACTION CONTEXT associated with the current session
    SELECT GET_FILESTREAM_TRANSACTION_CONTEXT()
    /*
    output
    ----------------------------------
    0xEC138B4813D41841A7360E5236D7CEDB
    */

Listing 3-4: Using T-SQL to retrieve the FILESTREAM transaction context.

If an active transaction context exists in the current session, then the return value is a byte array of type VARBINARY(MAX). If there is no active transaction in the current session, NULL is returned.

Note that, although the example in Listing 3-4 demonstrates how to call the GET_FILESTREAM_TRANSACTION_CONTEXT() function using T-SQL, the return value is meaningless in T-SQL; it has meaning only to a client application that wants to access the FILESTREAM data using the streaming APIs.

Listing 3-5 shows how to retrieve the current transaction context and the logical path to the FILESTREAM data value prior to requesting streaming access, from C#.

    // Retrieve the PathName() and transaction context
    SqlCommand cmd = new SqlCommand();
    cmd.Connection = con;
    cmd.CommandText = "SELECT TOP 1 " +
        " ItemImage.PathName() AS filePath, " +
        " GET_FILESTREAM_TRANSACTION_CONTEXT() AS txContext " +
        "FROM Items";

    // Begin a transaction
    SqlTransaction trn = con.BeginTransaction("ItemTran");
    cmd.Transaction = trn;
    // Execute the query
    SqlDataReader reader;
    reader = cmd.ExecuteReader();

Listing 3-5: Using C# to retrieve the transaction context and logical path name of a FILESTREAM data value.

Step 3: Opening the FILESTREAM data file

The next step is to open the FILESTREAM data file, using the wrapper function/class provided by SQL Server. Direct access to the physical files stored in the FILESTREAM data container can corrupt the database, so it's always advisable to access the data through these streaming APIs.

.NET developers can use the SqlFileStream object to access the FILESTREAM data. SqlFileStream exposes SQL Server data that is stored in a FILESTREAM column as a sequence of bytes. It resides within the namespace System.Data.SqlTypes and within the assembly System.Data.dll. You must have .NET Framework 3.5 SP1 or later to use the SqlFileStream class.

SqlFileStream is derived from System.IO.Stream. Once you have a valid SqlFileStream object representing the desired FILESTREAM data file, you can use the Read(), Write() or the other "stream" functions available within the class to manipulate your FILESTREAM data.

Listing 3-6 shows how to open a FILESTREAM data file for reading.

    // Open and read file using SqlFileStream Class
    // Advance the reader to the first row before reading column values
    reader.Read();
    string filePath = reader.GetString(0);
    object objContext = reader.GetValue(1);
    byte[] txContext = (byte[])objContext;
    SqlFileStream fs = new SqlFileStream(filePath, txContext, FileAccess.Read);

Listing 3-6: Using C# to open a FILESTREAM data file for read access.

The parameter filePath is the logical path name of the FILESTREAM data file we obtained using the PathName() method call, and txContext is the FILESTREAM transaction context associated with the current transaction.

The third parameter specifies the file access mode. It can be one of Read, Write, or ReadWrite. Read opens the FILESTREAM data file for reading all the content. When Write is specified, the SqlFileStream object will point to a zero-byte file and existing data will be overwritten when the object is closed and the transaction is committed. When ReadWrite is used, the file is opened for both reading and writing, and the file pointer is placed at the beginning of the file. By using the Seek() method, you can move the file pointer to any location in the file to overwrite the existing content, or you can move it to the end of the file to append more data. See SqlFileStream Class Reference, later in this chapter, for a detailed discussion of the different constructors and member functions of the SqlFileStream class.

Non-.NET programming languages can use the FILESTREAM Win32 API to obtain a Windows file handle to the requested FILESTREAM data file, and use the file handle to read or write FILESTREAM data. SQL Server Native Client (SQL Server 2008 or later) exposes a new Win32 API, OpenSqlFilestream(), which returns a Windows file handle to the requested FILESTREAM data file. Just like the .NET example we saw earlier, you need to pass the FILESTREAM transaction context and logical path to the desired FILESTREAM data value when calling this function, as shown in Listing 3-7.

    //Get a handle to the FILESTREAM data using the OpenSqlFilestream API
    SQLCHAR transactionToken[32];
    SQLINTEGER cbTransactionToken = sizeof(transactionToken);
    HANDLE srcHandle = INVALID_HANDLE_VALUE;
    srcHandle = OpenSqlFilestream(dstFilePath, SQL_FILESTREAM_READ,
        SQL_FILESTREAM_OPEN_NONE, transactionToken,
        cbTransactionToken, 0);

Listing 3-7: Using C++ to open a FILESTREAM data file for read access.

The first parameter is the logical path to the FILESTREAM data file obtained from a call to the PathName() function. The second argument specifies file access mode. It can be one of SQL_FILESTREAM_READ, SQL_FILESTREAM_WRITE, or SQL_FILESTREAM_READWRITE (for read, write, and read/write operations respectively).

The third parameter allows you to specify the file attributes and flags. These are additional hints that tell the system about the usage pattern, and the system will use this information to optimize caching and performance. In Listing 3-7 we have set it to SQL_FILESTREAM_OPEN_NONE which indicates that the file needs to be opened with no specific flags.

The fourth argument is the transaction context we retrieved, and the fifth specifies the size (in bytes) of the transaction context buffer.

The last argument allows you to specify the initial allocation size of the data file in bytes; this is ignored when the file is opened for read operation.

See OpenSqlFilestream API Reference, later in this chapter, for a detailed discussion of this API function.

Step 4: Reading and writing FILESTREAM data

Once you have a valid FILESTREAM file handle (if you are using the Win32 API) or a valid SqlFileStream object (if you are using the Managed API) you can go ahead and read or write FILESTREAM data. The read/write operations you perform on the Win32 handle or the SqlFileStream object will affect the disk file associated with the given FILESTREAM data value.

Listing 3-8 reads data from a FILESTREAM data file and copies it to a disk file. This example targets .NET Framework Version 4 or later, which adds the CopyTo() method to the Stream class. For earlier versions of the .NET Framework, one can write a helper method that will perform a similar buffered copy between two streams.

    //Open the FILESTREAM data file
    SqlFileStream fs = new SqlFileStream(filePath, txContext, FileAccess.Read);
    //Create the target data file
    FileStream fStream = new FileStream("C:\\temp\\MicrosoftMouse2.jpg",
        FileMode.CreateNew);
    //Transfer data from FILESTREAM data file to disk file using a 4KB buffer size
    fs.CopyTo(fStream, 4096);
    //Close the target file
    fStream.Close();

Listing 3-8: Using C# to read a FILESTREAM data file and write to a disk file.

Listing 3-9 demonstrates a write operation where the data is read from a disk file and copied into a FILESTREAM column.

    //Open the FILESTREAM data file for writing
    SqlFileStream fs = new SqlFileStream(filePath, txContext, FileAccess.Write);
    // Open the source file for reading
    FileStream fStream = new FileStream("C:\\temp\\MicrosoftMouse2.jpg",
        FileMode.Open, FileAccess.Read);

    //Transfer data from disk file into the FILESTREAM column
    fStream.CopyTo(fs, 4096);
    //Close the source file
    fStream.Close();

Listing 3-9: Using C# to read a disk file and write into a FILESTREAM column.

It is apparent that the SqlFileStream object makes FILESTREAM data manipulation much simpler. It is a little bit more complicated when reading FILESTREAM data using the Win32 API function, as shown in Listing 3-10.

    //Get a handle to the FILESTREAM data using the OpenSqlFilestream API
    fsHandle = OpenSqlFilestream(fsFilePath, SQL_FILESTREAM_READ, 0, tkn, tknSize, 0);
    //Create the disk file to copy data
    fileHandle = CreateFile(TEXT("C:\\temp\\microsoftmouse2.jpg"), GENERIC_WRITE,
        FILE_SHARE_WRITE, NULL, CREATE_NEW, FILE_FLAG_SEQUENTIAL_SCAN,
        NULL);
    const int COPYBUFFERSIZE = 4096;
    //Copy buffer (declared here so that the fragment is complete)
    BYTE buffer[COPYBUFFERSIZE];
    DWORD bytesRead = 0;
    DWORD bytesWritten = 0;
    do
    {
        //Read from FILESTREAM
        ReadFile(fsHandle, buffer, COPYBUFFERSIZE, &bytesRead, NULL);
        //Write to disk file
        if (bytesRead > 0)
        {
            WriteFile(fileHandle, buffer, bytesRead, &bytesWritten, NULL);
        }
    } while (bytesRead == COPYBUFFERSIZE);
    //close the disk file handle
    if ( fileHandle != INVALID_HANDLE_VALUE )
        CloseHandle(fileHandle);

Listing 3-10: Using C++ to read a FILESTREAM data file and write to a disk file.
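The copy loop in Listing 3-10 stops as soon as a read returns fewer bytes than the buffer size, and it does not inspect the results of the read and write calls. A slightly more defensive variant keeps reading until a read yields zero bytes and checks every write. The sketch below shows that pattern as a reusable helper; it uses standard C++ streams as stand-ins for the Win32 handles, and the name copyStream is our own, so treat it as an illustration of the loop structure rather than as FILESTREAM code.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Copies everything from `in` to `out` through a fixed-size buffer and
// returns the number of bytes copied. The loop ends when a read yields
// zero bytes, and each write is checked for failure.
// (Hypothetical helper; std streams stand in for the Win32 file handles.)
std::size_t copyStream(std::istream& in, std::ostream& out,
                       std::size_t bufferSize = 4096)
{
    if (bufferSize == 0) {
        throw std::invalid_argument("bufferSize must be greater than zero");
    }
    std::vector<char> buffer(bufferSize);
    std::size_t total = 0;
    for (;;) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::streamsize got = in.gcount();
        if (got <= 0) break;                    // end of data (or read error)
        out.write(buffer.data(), got);
        if (!out) throw std::runtime_error("write failed");
        total += static_cast<std::size_t>(got);
    }
    return total;
}
```

With the Win32 API, the same structure would test the BOOL results of ReadFile() and WriteFile() and stop when bytesRead is zero.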

Listing 3-11 opens a disk file and copies the contents of the file into a FILESTREAM column.

    //Get a handle to the FILESTREAM data using the OpenSqlFilestream API
    fsHandle = OpenSqlFilestream(dstFilePath, SQL_FILESTREAM_WRITE, 0, tkn, tknSize, 0);
    //Open the disk file to read
    fileHandle = CreateFile(TEXT("C:\\temp\\microsoftmouse2.jpg"), GENERIC_READ,
        FILE_SHARE_READ, NULL, OPEN_EXISTING,
        FILE_FLAG_SEQUENTIAL_SCAN, NULL);
    const int COPYBUFFERSIZE = 4096;
    //Copy buffer (declared here so that the fragment is complete)
    BYTE buffer[COPYBUFFERSIZE];
    DWORD bytesRead = 0;
    DWORD bytesWritten = 0;
    do
    {
        //Read from the disk file
        ReadFile(fileHandle, buffer, COPYBUFFERSIZE, &bytesRead, NULL);
        //Write to the FILESTREAM column
        if (bytesRead > 0)
        {
            WriteFile(fsHandle, buffer, bytesRead, &bytesWritten, NULL);
        }
    } while (bytesRead == COPYBUFFERSIZE);
    //close the disk file handle before closing the transaction
    if ( fileHandle != INVALID_HANDLE_VALUE )
        CloseHandle(fileHandle);

Listing 3-11: Using C++ to read from a disk file and write into a FILESTREAM column.

Step 5: Closing the FILESTREAM data file

After the desired operation is completed, it is important to close the FILESTREAM data file. The FILESTREAM handle should be closed before the transaction is committed or rolled back. Listing 3-12 shows how to close the FILESTREAM file handle.

    //close the file handle
    fs.Close();

Listing 3-12: Using C# to close the FILESTREAM file handle.

Any UPDATE triggers on the table will be fired when the FILESTREAM data file that was opened for the Write operation is closed.

Step 6: Closing the transaction (COMMIT/ROLLBACK)

After closing the FILESTREAM data file, the current transaction should be committed or rolled back depending upon the status of the operation. When the transaction is committed, the changes made to the FILESTREAM data file become permanent. A rollback will undo the changes made in the current transaction. Listing 3-13 commits the current transaction.

    //Commit the transaction
    cmd.Transaction.Commit();

Listing 3-13: Using C# to commit the transaction associated with an ADO.NET command object.
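The ordering rule in Steps 5 and 6 (close the file handle first, then end the transaction) is easy to get wrong on error paths. In C++, one way to enforce it is to let scoping do the work: the handle lives in an inner scope and the transaction in an outer one, so the handle is always released first, even when an exception is thrown. The sketch below demonstrates only this ordering; StreamHandle, Transaction, and the event log are hypothetical stand-ins, and a real program would call CloseHandle() and SQLEndTran() where the comments indicate.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

using EventLog = std::vector<std::string>;

// Hypothetical stand-in for a FILESTREAM file handle.
class StreamHandle {
public:
    explicit StreamHandle(EventLog& log) : log_(log) { log_.push_back("open"); }
    ~StreamHandle() { log_.push_back("close"); }   // real code: CloseHandle()
    StreamHandle(const StreamHandle&) = delete;
    StreamHandle& operator=(const StreamHandle&) = delete;
private:
    EventLog& log_;
};

// Hypothetical stand-in for a database transaction; rolls back unless committed.
class Transaction {
public:
    explicit Transaction(EventLog& log) : log_(log) { log_.push_back("begin"); }
    void commit() { log_.push_back("commit"); done_ = true; }  // real code: SQLEndTran(..., SQL_COMMIT)
    ~Transaction() { if (!done_) log_.push_back("rollback"); } // real code: SQLEndTran(..., SQL_ROLLBACK)
    Transaction(const Transaction&) = delete;
    Transaction& operator=(const Transaction&) = delete;
private:
    EventLog& log_;
    bool done_ = false;
};

// Runs the six steps against the stand-ins and returns the order of events.
EventLog runStreamingOperation(bool ioSucceeds)
{
    EventLog log;
    try {
        Transaction tx(log);                 // Step 1
        {
            StreamHandle handle(log);        // Step 3
            if (!ioSucceeds) throw std::runtime_error("I/O failed");
            log.push_back("io");             // Step 4
        }                                    // Step 5: handle closed here
        tx.commit();                         // Step 6
    } catch (const std::runtime_error&) {
        // The handle was closed and the transaction rolled back while unwinding.
    }
    return log;
}
```

On the success path the recorded order is begin, open, io, close, commit; on failure it is begin, open, close, rollback. In both cases the handle is closed before the transaction ends.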

Data Manipulation Using the Streaming APIs

On a FILESTREAM-enabled table, a client application might perform one or more of the database operations below.

• Insert a new record into a FILESTREAM-enabled table.
• Replace the FILESTREAM data completely.
• Replace the FILESTREAM data partially.
• Append more data to an existing FILESTREAM data value.
• Read the information stored in a FILESTREAM column.
• Delete the FILESTREAM data value from a column.
• Delete a row from a FILESTREAM-enabled table.

Some of these operations must be done using T-SQL, whereas some of them are best done using the streaming APIs. We'll examine each one in detail, and recommend the best approach. We'll then present a worked lab that demonstrates how to insert, read, and partially update FILESTREAM data via the streaming APIs.

Inserting a new record into a FILESTREAM-enabled table

As discussed in Chapter 2, unless the FILESTREAM column is assigned a default value, a client application will need to execute a T-SQL INSERT statement in order to add a new row into a FILESTREAM-enabled table, rather than use the streaming APIs. The reason for this is that streaming access can be made only to rows that contain non-NULL BLOB data in the FILESTREAM column.

To avoid additional round trips to the database, the stored procedure or T-SQL batch that is used to perform the INSERT operation can also return the FILESTREAM transaction context of the current transaction and the logical path to the FILESTREAM data file associated with the newly inserted record. The client application can then use these values to access the FILESTREAM data.

Since it is not possible to use the streaming API to create a new FILESTREAM data file and associate it with a FILESTREAM column, to create a new record and save the BLOB data into the FILESTREAM column the client application must use two steps.

1. Execute a T-SQL statement to create a new record along with a dummy FILESTREAM value. Inserting a fake value such as 0x into the FILESTREAM column will force SQL Server to create a FILESTREAM data file and associate it with the column.
2. Once the FILESTREAM data file has been created, the client application can use the PathName() value of the FILESTREAM column in the newly inserted record to access the FILESTREAM data file and write the BLOB data into it.

An alternative solution involves not allowing NULL values in the FILESTREAM column and providing a default value of 0x. This ensures that the FILESTREAM cell will always be associated with a valid file, and can remove some repetitive code from the application that is writing FILESTREAM data.

Replacing the FILESTREAM data completely

When the FILESTREAM data is to be replaced completely, this is usually done using the streaming APIs. A client application can perform a write operation using either the SqlFileStream class (.NET) or the OpenSqlFilestream() Win32 API function. When the FILESTREAM data file is opened for write access, the file pointer is at the beginning of the file. The write operation will write into a new disk file, and the reference on the FILESTREAM column will be updated to point to the new file when the file handle is

closed and the transaction is committed. The old FILESTREAM data file will be removed by an asynchronous garbage collector thread.

Note that any UPDATE triggers that exist on the target table will be fired when the FILESTREAM file handle is closed.

Partial updates to FILESTREAM data

A partial update involves replacing a part of an existing FILESTREAM data file with new content, or appending to the end of an existing BLOB file. Unlike partial updates on VARBINARY(MAX) columns, attempting partial updates on FILESTREAM cells is a very slow operation because a new disk file will be created with the new file content every time the contents change. This adds a lot of additional I/O requirements. In effect, there is no such thing as a partial update to a FILESTREAM data file so, if your business or application requires frequent small partial updates to the BLOB data, you might want to use VARBINARY(MAX) columns rather than FILESTREAM storage.

If it is necessary to support a legacy application, partial updates can be simulated by opening the FILESTREAM data file for read-write access and then moving the file pointer to the appropriate position before writing the new data to the desired location, either overwriting existing data, or appending the data to the end of the file.

Any UPDATE triggers that exist on the target table will be fired when the FILESTREAM file handle is closed.
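A little arithmetic shows why frequent small updates are expensive when every change produces a new file. The C++ sketch below uses a deliberately simplified cost model, which is our assumption for illustration and not a measurement of SQL Server: each append rewrites the entire file, so the bytes written by an append equal the file size after that append. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Simplified cost model (an assumption for illustration, not a measurement):
// every change rewrites the whole file, so the bytes written by each append
// equal the file size after that append.
std::uint64_t bytesWrittenWithFullRewrite(std::uint64_t initialSize,
                                          std::uint64_t appendSize,
                                          std::uint64_t appendCount)
{
    std::uint64_t total = 0;
    std::uint64_t size = initialSize;
    for (std::uint64_t i = 0; i < appendCount; ++i) {
        size += appendSize;   // the file grows by one append
        total += size;        // the whole file is written out again
    }
    return total;
}

// Baseline for comparison: an in-place append writes only the new bytes.
std::uint64_t bytesWrittenInPlace(std::uint64_t appendSize,
                                  std::uint64_t appendCount)
{
    return appendSize * appendCount;
}
```

Under this model, appending 1 KB one hundred times to a 1 MB value writes 110,028,800 bytes in total, compared with 102,400 bytes if only the new data were written.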

Reading information from the FILESTREAM store

.NET applications can open the FILESTREAM data file for read access using the SqlFileStream class and specifying FileAccess.Read.

Win32 applications can use the OpenSqlFilestream() Win32 API function to obtain a handle to the FILESTREAM data file associated with the FILESTREAM cell.

Just like the write operation, read will also require a two-step approach; the client application must retrieve the FILESTREAM transaction context and the logical path name for the FILESTREAM cell before accessing the FILESTREAM data file associated with the column.

Deleting the BLOB data stored in a FILESTREAM column

The recommended way to delete the BLOB data stored in a FILESTREAM column is by using a T-SQL UPDATE statement, as was described in Chapter 2. The BLOB data associated with a FILESTREAM column is deleted when the value of the column is set to NULL.

Deleting a row from a FILESTREAM-enabled table

Just like the INSERT operation, deleting a row from a FILESTREAM-enabled table must be done using T-SQL. This can be achieved directly, by using a DELETE query, or indirectly, in which a client library such as ADO.NET runs a DELETE query.

Lab 1: Reads, Writes and Partial Updates

It is time for us to write some fully functional sample code that reads and writes FILESTREAM data to perform the FILESTREAM operations below.

1. Inserting a new record (with T-SQL) and storing FILESTREAM data in the column.
2. Reading information from a FILESTREAM column.
3. Performing partial updates on the BLOB data stored in a FILESTREAM column.

In the following sections, we'll provide the sample code and step-by-step explanations for each of these operations in C# using the Managed API. The sample code for VB.NET (using the Managed API) and C++ (using the Win32 API) is supplied separately in the download. Though the syntax and code used in each of these programming languages differ a little, the overall flow of the operations remains the same across all these programming languages.

Inserting a new record with FILESTREAM data

Let us see an example that reads the content of a disk file and stores it in the FILESTREAM data container. We will load the content of the Microsoft Mouse image file that we used in Chapter 2.

The first step is to create a connection to the desired target SQL Server 2008 instance that hosts the database with the FILESTREAM-enabled table in which we want to store the BLOB data, as shown in Listing 3-14.

    //References at the top of the file
    using System;
    using System.Data;
    using System.Windows.Forms;
    using System.Data.SqlClient;

    using System.Data.SqlTypes;
    using System.IO;

    public void SaveItemImage()
    {
        byte[] txContext = null;
        string filePath = string.Empty;
        // Create a connection to the database
        string ConStr = ("Data Source=(local);" +
            "Initial Catalog=NorthPole;Integrated Security=True");
        SqlConnection con = new SqlConnection(ConStr);
        con.Open();

Listing 3-14: Creating a connection to the SQL Server instance.

Note that to access FILESTREAM data through the streaming API, you must connect to SQL Server using Integrated Security (Windows authentication). If you connect to SQL Server using SQL Server authentication, you will get an "Access Denied" error when you try to open the FILESTREAM data file.

Next, we insert a row with a dummy value into the FILESTREAM column in the Items table, using the T-SQL INSERT statement shown in Listing 3-15. Notice the OUTPUT clause, which returns the path name associated with the FILESTREAM column of the newly inserted row, along with the current FILESTREAM transaction context.

        // Insert a record to the table and retrieve the PathName()
        // A dummy value is inserted to the ItemImage column to obtain a valid PathName()
        SqlCommand cmd = new SqlCommand();
        cmd.Connection = con;
        cmd.CommandText = "INSERT INTO Items " +
            "(ItemID, ItemNumber, ItemDescription, ItemImage) " +
            "OUTPUT GET_FILESTREAM_TRANSACTION_CONTEXT() AS txContext," +
            " inserted.ItemImage.PathName() AS filePath " +
            "SELECT NEWID(), 'A0001', 'Microsoft Mouse', 0x";

        SqlTransaction transaction = con.BeginTransaction("ItemTran");
        cmd.Transaction = transaction;
        SqlDataReader reader = cmd.ExecuteReader();
        if (reader.Read())
        {
            txContext = (reader["txContext"] as byte[]);
            filePath = reader["filePath"].ToString();
        }
        reader.Close();

Listing 3-15: Inserting a record and retrieving the FILESTREAM file path and transaction context.

Once we have the logical path to the FILESTREAM data file and the FILESTREAM transaction context, we can write the binary data (the content of the disk file that stores the image) into the FILESTREAM store, as shown in Listing 3-16.

        //Open the FILESTREAM data file for writing
        SqlFileStream fs = new SqlFileStream(filePath, txContext,
            FileAccess.Write);
        // Open the disk file for reading
        FileStream localFile = new FileStream("C:\\temp\\microsoftmouse.jpg",
            FileMode.Open, FileAccess.Read);
        //Copy data from disk file to FILESTREAM column
        localFile.CopyTo(fs, 4096);

Listing 3-16: Writing the content of a disk file into a FILESTREAM data file.

And finally, we close the FILESTREAM file, commit the transaction, and close the database connection to complete the exercise (Listing 3-17).

        //Close the files
        localFile.Close();
        fs.Close();

        //Commit the transaction and close the connection
        cmd.Transaction.Commit();
        con.Close();

Listing 3-17: Cleanup code.

As previously mentioned, the full listings for this exercise using C#, VB.NET, and C++ can be found in the download files.

Reading FILESTREAM data

Now we'll look at an example that reads the BLOB data from the FILESTREAM store in the Items table and creates a disk file with the BLOB data.

First, we need to create a connection to the SQL Server 2008 instance. The next step is to write a SELECT query that retrieves the path name and FILESTREAM transaction context of the desired record in the Items table (Listing 3-18). To make the code listing simpler, we will just select the TOP 1 row from the table. In real life, you might be querying for a specific record based on search criteria.

    public void ReadItemImage()
    {
        string filePath = string.Empty;
        byte[] txContext = null;
        string ConStr;
        ConStr = ("Data Source=(local);Initial Catalog=NorthPole"
            + ";Integrated Security=True");
        SqlConnection con = new SqlConnection(ConStr);
        con.Open();
        SqlCommand cmd = new SqlCommand();
        cmd.Connection = con;
        cmd.CommandText = "SELECT TOP 1 " +
            " ItemImage.PathName() AS filePath, " +
            " GET_FILESTREAM_TRANSACTION_CONTEXT() AS txContext " +
            "FROM Items";

        SqlTransaction trn = con.BeginTransaction("ItemTran");
        cmd.Transaction = trn;
        SqlDataReader reader;
        reader = cmd.ExecuteReader();
        // Retrieve the PathName() of the image file and the transaction context
        if (reader.Read())
        {
            txContext = (reader["txContext"] as byte[]);
            filePath = reader["filePath"].ToString();
        }
        reader.Close();

Listing 3-18: Retrieving the FILESTREAM logical path name and transaction context.

We can now go ahead and open the FILESTREAM data file. The sample code in Listing 3-19 opens a FILESTREAM data file for read operation.

        //Open the FILESTREAM data file
        SqlFileStream fs = new SqlFileStream(filePath, txContext, FileAccess.Read);

Listing 3-19: Opening a FILESTREAM data file for read operation.

Next, we open the target file into which we will write the data being read from the FILESTREAM data file (Listing 3-20).

        //Create the target data file
        FileStream localFile = new FileStream("C:\\temp\\MicrosoftMouse2.jpg",
            FileMode.CreateNew);

Listing 3-20: Creating a disk file to store the BLOB data.

In Listing 3-21, we read from the FILESTREAM data file and write into the disk file we just created.

        // Transferring data from FILESTREAM data file to disk file
        fs.CopyTo(localFile, 4096);

Listing 3-21: Reading from the FILESTREAM data file and writing to the disk file.

This should be followed by the cleanup operation to close the FILESTREAM file, commit the transaction, and close the database connection (Listing 3-17).

Performing partial updates

A partial update is performed by opening the FILESTREAM data file for read-write operation, and then moving the file pointer to the desired location using the Seek() function, or to the end of the file to append new content.

In order to verify the result of the operation manually, we will use some text data instead of the binary image file from the previous examples. We will perform the actions below.

1. Create a text file on a local folder.
2. Open the FILESTREAM data file for write access and store the content of the text file in it.
3. Open the FILESTREAM data file for read-write access, and then move to the end of the file and write the content of the text file (created at Step 1) into the file again. There will be two copies of the text in the FILESTREAM data file.
4. Open the FILESTREAM data file for read access, and then read the contents of the data file and write it into a disk file.
5. Open the disk file created at Step 4 and verify that it contains two copies of the original text.
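Before running the lab against SQL Server, the seek-and-write pattern behind the steps above can be tried out in isolation. The C++ sketch below uses an in-memory std::stringstream as a stand-in for the FILESTREAM data file; it illustrates only the file-pointer logic (append at the end, or overwrite at an offset) and does not call any FILESTREAM API. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Writes `text`, moves the write position to the end, and writes it again.
// The std::stringstream is only a stand-in for the FILESTREAM data file.
std::string writeThenAppend(const std::string& text)
{
    std::stringstream store(std::ios::in | std::ios::out | std::ios::binary);
    // Step 2: initial write of the text
    store.write(text.data(), static_cast<std::streamsize>(text.size()));
    // Step 3: seek to the end and write the same text again
    store.seekp(0, std::ios::end);
    store.write(text.data(), static_cast<std::streamsize>(text.size()));
    // Step 4: read the whole content back
    return store.str();
}

// Writes `text`, then overwrites part of it starting at `offset`.
std::string overwriteAt(const std::string& text, std::size_t offset,
                        const std::string& patch)
{
    std::stringstream store(std::ios::in | std::ios::out | std::ios::binary);
    store.write(text.data(), static_cast<std::streamsize>(text.size()));
    // Move the write position to the requested offset from the beginning
    store.seekp(static_cast<std::streamoff>(offset), std::ios::beg);
    store.write(patch.data(), static_cast<std::streamsize>(patch.size()));
    return store.str();
}
```

With the Managed API, the equivalent calls are Seek(0, SeekOrigin.End) for an append and Seek(offset, SeekOrigin.Begin) for an overwrite at a given position.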

To start with, let's create a text file with some simple text data in the C:\temp folder, and name the file original.txt (Figure 3-1).

Figure 3-1: Creating a text file.

Next, we need to write the content of this file into the FILESTREAM data store. As previously, we need to connect to SQL Server, start a transaction, and retrieve the FILESTREAM column logical path name and the FILESTREAM transaction context.

Next, we will read the content of original.txt and write it into the FILESTREAM file. We will open the FILESTREAM data file for write access and write the content of the source file into it (Listing 3-22).

    //Open the FILESTREAM data file and store the content of source file
    SqlFileStream fs = new SqlFileStream(filePath, txContext,
        FileAccess.Write);
    FileStream localFile = new FileStream("C:\\temp\\original.txt",
        FileMode.Open, FileAccess.Read);
    localFile.CopyTo(fs, 4096);
    localFile.Close();
    fs.Close();

Listing 3-22: Reading the content of original.txt and writing it to the FILESTREAM data store.

Next, we'll perform the partial update. In Listing 3-23, we open the file for read-write access, move to the end of the file, and write the content of original.txt again.

    //Re-open the FILESTREAM data file and move to the end of file
    fs = new SqlFileStream(filePath, txContext, FileAccess.ReadWrite);
    fs.Seek(0, SeekOrigin.End);
    localFile = new FileStream("C:\\temp\\original.txt",
        FileMode.Open, FileAccess.Read);
    //Append the content of original.txt again
    localFile.CopyTo(fs, 4096);
    localFile.Close();
    fs.Close();

Listing 3-23: Performing a partial update.

We just performed a partial update on the FILESTREAM column. We will now verify that the information is stored as expected, by reading the contents of the FILESTREAM data file and creating a new text file with the data. We can then open the text file and verify that the partial update took place correctly (Listing 3-24).

    //Open the FILESTREAM data file for 'READ' and copy contents
    //into a disk file
    fs = new SqlFileStream(filePath, txContext, FileAccess.Read);
    localFile = new FileStream("C:\\temp\\copy.txt", FileMode.Create);
    fs.CopyTo(localFile, 4096);
    localFile.Close();
    fs.Close();

Listing 3-24: Reading from the FILESTREAM data file and writing to a disk file.

Let's now open the text file (copy.txt) and verify that it contains two copies of the text data that was present in original.txt (Figure 3-2).
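For reference, the append step from Listing 3-23 can be combined with the connection, transaction, and lookup code it depends on. The following is only a minimal sketch under stated assumptions: the connection string, the class name, and the WHERE clause on ItemNumber are placeholders chosen for illustration, and the using blocks are a stylistic choice rather than something the listings above require.

```csharp
using System;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.IO;

class PartialUpdateSketch
{
    static void Main()
    {
        // Assumption: placeholder connection string; Windows authentication is used.
        const string connectionString =
            "Data Source=localhost;Initial Catalog=NorthPole;Integrated Security=True";

        using (SqlConnection con = new SqlConnection(connectionString))
        {
            con.Open();
            // Disposing an uncommitted transaction rolls it back.
            using (SqlTransaction trn = con.BeginTransaction("ItemTran"))
            {
                string filePath = null;
                byte[] txContext = null;

                // Assumption: the row is located by ItemNumber, purely for illustration.
                using (SqlCommand cmd = new SqlCommand(
                    "SELECT ItemImage.PathName() AS filePath, " +
                    "GET_FILESTREAM_TRANSACTION_CONTEXT() AS txContext " +
                    "FROM dbo.Items WHERE ItemNumber = @ItemNumber", con, trn))
                {
                    cmd.Parameters.AddWithValue("@ItemNumber", "MS1001");
                    using (SqlDataReader reader = cmd.ExecuteReader())
                    {
                        // PathName() is NULL when the FILESTREAM value is NULL.
                        if (reader.Read() && !reader.IsDBNull(0))
                        {
                            filePath = reader.GetString(0);
                            txContext = (byte[])reader["txContext"];
                        }
                    }
                }

                if (filePath == null)
                {
                    trn.Rollback();
                    return;
                }

                // Open for read-write, seek to the end, and append the text file.
                using (SqlFileStream fs = new SqlFileStream(
                    filePath, txContext, FileAccess.ReadWrite))
                using (FileStream localFile = new FileStream(
                    @"C:\temp\original.txt", FileMode.Open, FileAccess.Read))
                {
                    fs.Seek(0, SeekOrigin.End);
                    localFile.CopyTo(fs, 4096);
                }

                // The stream is closed before the transaction is committed.
                trn.Commit();
            }
        }
    }
}
```

Because both streams are disposed before Commit() is called, the sketch keeps to the ordering used throughout this chapter: close the FILESTREAM file first, then end the transaction.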

Figure 3-2: Verifying that the copy contains two copies of the original text.

We can see that the partial update has taken place as expected.

Handling multiple FILESTREAM columns and rows

One of the interesting features T-SQL programming offers is the ability to perform set-based operations. A set operation allows you to process more than one row and column in a single T-SQL query.

However, when it comes to accessing FILESTREAM data through the streaming APIs, set-based operations are not applicable. In the examples discussed above, we saw how to read or write a single FILESTREAM cell using I/O streaming. If you want to read or write more than one FILESTREAM column or row, you must process each, one by one.

The pseudocode in Listing 3-25 shows an approximate representation of how to read FILESTREAM data from more than one row and column.

    /*
    Connect to database
    Begin a transaction
    Retrieve 10 rows from a table that has 3 FILESTREAM columns
        SELECT TOP 10
            GET_FILESTREAM_TRANSACTION_CONTEXT(),
            FileStreamColumn1.PathName(),
            FileStreamColumn2.PathName(),
            FileStreamColumn3.PathName()
        FROM Items
    For each row in the result set
        For column 1 to 3
            Open the FILESTREAM data file
            Read from FILESTREAM data File
            Do something with the data
            Close the FILESTREAM data file
        Next column
    Next row
    Commit transaction
    Close database connection
    */

Listing 3-25: Pseudocode that shows how to read BLOB data from more than one row and column.

Understanding the Logical Path to a FILESTREAM Data File

You may have noticed by now that FILESTREAM deals with a lot of logical identifiers! For example, the FILESTREAM file share, configured while enabling the FILESTREAM feature (see Appendix A), is not mapped to a physical directory on the system; it is simply a logical identifier that SQL Server uses internally to point to the FILESTREAM data container. If you try to explore the FILESTREAM share name by double-clicking on it from Windows Explorer, you will realize that it does not open any folder, as would be expected for a "normal" file share.

Just like the FILESTREAM file share, the logical path to a FILESTREAM data file (returned by the PathName() function) is an internal identifier that SQL Server uses to point to the disk file associated with a FILESTREAM data value. When a client application wants to read or write FILESTREAM data through the streaming APIs, it needs to have a way to exactly pinpoint the FILESTREAM value on a specific table, row, or column. The PathName() function makes this simpler. A client application can run a T-SQL query that locates the specific FILESTREAM value and calls the PathName() function to retrieve a logical identifier pointing to it.
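As a hedged illustration of this pattern, the sketch below fleshes out the loop from Listing 3-25 in C#: it retrieves the logical path of every FILESTREAM cell with PathName() and then opens each one in turn. The connection string and class name are placeholders, FileStreamColumn1 to FileStreamColumn3 are the hypothetical column names from the pseudocode, and "do something with the data" is reduced to counting bytes. Buffering the paths in a list before opening any stream is a conservative choice made here, not a requirement stated in this chapter.

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.IO;

class MultiCellReadSketch
{
    static void Main()
    {
        // Assumption: placeholder connection string.
        const string connectionString =
            "Data Source=localhost;Initial Catalog=NorthPole;Integrated Security=True";

        using (var con = new SqlConnection(connectionString))
        {
            con.Open();
            using (SqlTransaction trn = con.BeginTransaction())
            {
                byte[] txContext = null;
                var paths = new List<string>();

                // Hypothetical table with three FILESTREAM columns, as in Listing 3-25.
                using (var cmd = new SqlCommand(
                    "SELECT TOP 10 GET_FILESTREAM_TRANSACTION_CONTEXT(), " +
                    "FileStreamColumn1.PathName(), " +
                    "FileStreamColumn2.PathName(), " +
                    "FileStreamColumn3.PathName() " +
                    "FROM Items", con, trn))
                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        // The context belongs to the transaction, so it is the same on every row.
                        txContext = (byte[])reader[0];
                        for (int col = 1; col <= 3; col++)
                        {
                            // Skip NULL cells: PathName() returns NULL for them.
                            if (!reader.IsDBNull(col))
                            {
                                paths.Add(reader.GetString(col));
                            }
                        }
                    }
                }

                // Process each FILESTREAM cell one by one.
                byte[] buffer = new byte[4096];
                foreach (string path in paths)
                {
                    using (var fs = new SqlFileStream(path, txContext, FileAccess.Read))
                    {
                        long total = 0;
                        int read;
                        while ((read = fs.Read(buffer, 0, buffer.Length)) > 0)
                        {
                            total += read; // stand-in for real processing
                        }
                        Console.WriteLine("{0}: {1} bytes", path, total);
                    }
                }

                trn.Commit();
            }
        }
    }
}
```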

The client application can then pass this value to the FILESTREAM API method call to open the NTFS data file represented by the logical identifier returned by the PathName() function.

To illustrate this point, let's return to the example we first saw in Chapter 2, and create a new FILESTREAM-enabled database (Listing 3-26).

    CREATE DATABASE NorthPole ON
    PRIMARY (
        NAME = NorthPoleDB,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter3\NorthPoleDB.mdf'
    ), FILEGROUP NorthPoleFS CONTAINS FILESTREAM (
        NAME = NorthPole_fs,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter3\NorthPole_fs'
    )
    LOG ON (
        NAME = NorthPole_log,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter3\NorthPole_log.ldf'
    )

Listing 3-26: Creating a FILESTREAM-enabled database.

In Listing 3-27, we create a table with a FILESTREAM column and insert some data.

    -- Create "Items" table
    CREATE TABLE [dbo].[Items](
        [ItemID] UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
        [ItemNumber] VARCHAR(20),
        [ItemDescription] VARCHAR(50),
        [ItemImage] VARBINARY(MAX) FILESTREAM NULL
    )
    GO
    -- Insert the data to the table
    INSERT INTO Items (ItemID, ItemNumber, ItemDescription, ItemImage)
    SELECT
        NEWID(), 'MS1001', 'Microsoft Mouse',
        CAST(bulkcolumn AS VARBINARY(MAX))
    FROM OPENROWSET(BULK 'C:\temp\MicrosoftMouse.jpg', SINGLE_BLOB) AS x

Listing 3-27: Creating a table with a FILESTREAM column and populating it with sample data.

Let us now try to query the table and see the data we just inserted (Listing 3-28).

    SELECT * FROM Items
    /*
    ItemID       Item#  ItemDescription ItemImage
    ------------ ------ --------------- -------------------
    E76D878F-0.. MS1001 Microsoft Mouse 0xFFD8FFE000104A4..
    */

Listing 3-28: Querying the Items table.

The column ItemImage stores the FILESTREAM data we just inserted. To a T-SQL query, the FILESTREAM column appears as a VARBINARY(MAX) column. However, the data is stored in the FILESTREAM data container as a disk file. Every BLOB value in every FILESTREAM column (in every non-NULL FILESTREAM field) has a disk file associated with it in the FILESTREAM data container.

The logical path name refers to the disk file associated with each BLOB data value, and can be retrieved by using the new intrinsic function PathName(), as shown in Listing 3-29. Note that PathName() is case sensitive. You will get an error if the correct case is not used when invoking this method.

    SELECT
        ItemNumber,
        ItemImage.PathName() AS LogicalPath
    FROM Items
    /*
    Item#  LogicalPath
    ------ --------------------------------------------------
    MS1001 \\localhost\MSSQLSERVER\v1\

The example above shows the logical path to the FILESTREAM data inserted into the Items table using the script in Listing 3-27 on the author's computer. If you run the code on your computer, you might see a slightly different result.

Table 3-1 dissects the component parts of the logical path name of a FILESTREAM data file.

Component                                Description
---------------------------------------  -----------------------------------------------
\\localhost                              Computer name.
\MSSQLSERVER                             FILESTREAM share name.
\v1                                      FILESTREAM version number (used internally). On SQL Server 2008 and 2008 R2 the version number shows up as "v1." On SQL Server 2012, the version number shows up as "v02" which is part of a more complex value such as "v02-A60EC2F8-2B24-11DF-9CC3-AF2E56D89593."
\NorthPole\dbo\Items                     Fully qualified object name (Database\Schema\Table).
\ItemImage                               Column name.
\E76D878F-0B2D-48EE-906E-58349797D6EC    File identifier. This is the value of the ROWGUIDCOL of the current row.
VolumeHint-HarddiskVolume2               This additional value is seen only in SQL Server 2012.

Table 3-1: The components of the logical path name.

Note that the logical path name is useful only to SQL Server; you cannot reliably identify the disk file using this information. SQL Server installs a filter driver which translates the logical path name to the correct disk file when you try to access it using the streaming API functions.

The PathName() function returns NULL if the FILESTREAM column is NULL. This means that a FILESTREAM column with NULL value cannot be accessed using the Win32 streaming APIs.

Formatting PathName()

When invoking the PathName() function, you can specify additional flags to format the output that is returned. PathName() takes an integer parameter 0, 1, or 2; the default is 0.

The only difference is in the server name. Table 3-2 shows the way in which PathName() formats the server name with the different flags that it supports.

Call                        Server name returned                    Description
--------------------------  --------------------------------------  -----------------------------------------------
PathName() or PathName(0)   \\JACOB-LAPTOP                          Converts the server name to NetBIOS format.
PathName(1)                 \\jacob-laptop                          Does not perform any conversion.
PathName(2)                 \\jacob-laptop.beyondrelational.com     If the server is part of a domain, returns the fully qualified domain name. If the server is not part of a domain, does not perform any conversion, as for PathName(1).

Table 3-2: PathName() parameters.

In most cases this formatting may not be helpful for us as database administrators or developers. According to Srini Acharya, a Senior Program Manager in the Relational Engine development group at Microsoft SQL Server, this formatting is intended for the use of testing and support engineers at Microsoft.

PathName() and ROWGUIDCOL

In Chapter 1, we stated that a FILESTREAM column could be created only on a table that has one (and only one) UNIQUEIDENTIFIER column, with the ROWGUIDCOL attribute, which indicates that the value in that column can be used to uniquely identify each row in the table. However, since the database engine does not do anything to enforce this uniqueness, the NOT NULL and UNIQUE attributes are also required on this column.

The value stored in the column marked as ROWGUIDCOL can even be accessed without knowing the name of the column. For example, if a table named Customers exists, and the UNIQUEIDENTIFIER column, with the ROWGUIDCOL attribute, is called CustomerID, then values from the CustomerID column can be retrieved using SELECT $ROWGUID FROM Customers.

Not only must this ROWGUIDCOL column of the table exist, it must also be visible to the query that uses the PathName() function. If we create a view or common table expression (CTE) that does not include the ROWGUIDCOL column, executing the PathName() function on the view or CTE will generate an error (Books Online clearly explains the reason for this restriction).

Listing 3-30 creates a VIEW of the Items table, but does not include the ROWGUIDCOL in the view.

    -- Create a VIEW
    CREATE VIEW ItemInfo AS
    SELECT ItemNumber, ItemImage FROM Items
    GO
    -- SELECT from the VIEW
    SELECT
        ItemNumber,
        ItemImage.PathName()
    FROM ItemInfo
    /*
    -- Output
    Msg 5539, Level 16, State 1, Line 3
    The ROWGUIDCOL column associated with the FILESTREAM being used is not visible
    where method PathName is called.
    */

Listing 3-30: Using a view to show that the ROWGUIDCOL must be visible to the PathName() query.

The ItemInfo view does not include the ROWGUIDCOL column, so a call to PathName() on any of the columns will generate an error. To fix this problem, we must include the ItemID column (which is marked as ROWGUIDCOL) in the view, as shown in Listing 3-31.

    ALTER VIEW ItemInfo AS
    SELECT
        ItemNumber,
        ItemImage,
        ItemID --This is needed for PathName()
    FROM Items

Listing 3-31: Adding the ROWGUIDCOL column to the view.

Similarly, the query on the CTE in Listing 3-32 will fail because the ROWGUIDCOL is not visible to the PathName() function.

    ;WITH cte AS (
        SELECT
            ItemNumber,
            ItemImage
        FROM Items
    )
    SELECT
        ItemNumber,
        ItemImage.PathName()
    FROM cte
    /*
    -- Output
    Msg 5539, Level 16, State 1, Line 1
    The ROWGUIDCOL column associated with the FILESTREAM being used is not visible
    */

Listing 3-32: Using a CTE to show that the ROWGUIDCOL must be visible to the PathName() query.

Again, because the CTE does not include the ROWGUIDCOL column, a call to PathName() on any of the columns will generate an error. For the same reason, the query in Listing 3-33 will also fail.

    SELECT
        ItemNumber,
        ItemImage.PathName()
    FROM (
        SELECT
            ItemNumber,
            ItemImage
        FROM Items
    ) a
    /*
    -- Output
    Msg 5539, Level 16, State 1, Line 1
    The ROWGUIDCOL column associated with the FILESTREAM being used is not visible
    where method PathName is called.
    */

Listing 3-33: The ROWGUIDCOL must be visible to the PathName() query.

The SELECT statement does not include the ROWGUIDCOL column, so the PathName() query generates an error.

SqlFileStream Class Reference

The SqlFileStream class represents the FILESTREAM data file associated with the BLOB data stored in a FILESTREAM cell. This class exposes the BLOB as a sequence of bytes.

SqlFileStream is derived from System.IO.Stream and resides in the System.Data.SqlTypes namespace. To use the SqlFileStream class, you need a reference to the System.Data assembly (System.Data.dll). (SqlFileStream was added to the Framework in .NET Framework 3.5 SP1, which also includes 2.0 SP2 and 3.0 SP2, and is available in later versions.)

Instantiating a SqlFileStream object

SqlFileStream provides two constructors that you can use to instantiate an object with different levels of control over the file being accessed. We have seen an example of the basic constructor earlier in this chapter; Listing 3-34 shows it again.

    // Open and read file using SqlFileStream Class
    SqlFileStream fs = new SqlFileStream(filePath, txContext, FileAccess.Read);

Listing 3-34: C# code that opens a FILESTREAM data file for read access.

• filePath – This is the logical path to the physical file associated with the FILESTREAM data value. The logical path is retrieved by a call to the PathName() function on the FILESTREAM column on the specific row in the table.
• txContext – This is the FILESTREAM transaction context associated with the current transaction. Note that you need an active transaction even to read FILESTREAM data when accessing through I/O streaming.

• FileAccess – This parameter specifies the required file access mode. The possible values are:
    • FileAccess.Read – Opens the FILESTREAM data file for reading. The file pointer is placed at the beginning of the file.
    • FileAccess.Write – Opens the FILESTREAM data file for writing.
    • FileAccess.ReadWrite – Opens the FILESTREAM data file for reading and writing. The file pointer is placed at the beginning of the file. You can use the Seek() method to move the file pointer to specific locations in the file and make partial updates, or to move to the end of the file to append content.

SqlFileStream provides another version of the constructor that gives more control over the FILESTREAM data file being opened. Listing 3-35 shows an example that demonstrates the advanced SqlFileStream constructor.

    // Open and read file using SqlFileStream Class
    SqlFileStream fs = new SqlFileStream(filePath, txContext, FileAccess.Read,
        FileOptions.SequentialScan, 0);

Listing 3-35: Using the advanced SqlFileStream constructor.

This version of the SqlFileStream constructor allows you to specify more options when opening the FILESTREAM data file. The additional parameters enable you to optimize the FILESTREAM operations.

The first three parameters are the same as for the basic SqlFileStream constructor which we discussed earlier. The FileOptions parameter enables you to specify additional flags that inform the system about your intended file access pattern and optimize the operations accordingly. For example, Listing 3-35, above, tells the system that you intend to use the file sequentially; the system uses this information to optimize FILESTREAM access because it knows in advance that you will not move the file pointer backwards or jump into a random location in the file.
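One way the five-argument constructor from Listing 3-35 might be wrapped is sketched below. The class and method names (FileStreamOpener, OpenForSequentialRead) are hypothetical, and rolling the transaction back when the constructor throws is a cautious default assumed here; treat it as a starting point rather than a definitive implementation.

```csharp
using System;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.IO;

static class FileStreamOpener
{
    // Hypothetical helper: opens a FILESTREAM data file for sequential reading.
    // filePath comes from PathName(); txContext comes from
    // GET_FILESTREAM_TRANSACTION_CONTEXT(); trn is the transaction that produced it.
    public static SqlFileStream OpenForSequentialRead(
        string filePath, byte[] txContext, SqlTransaction trn)
    {
        try
        {
            // SequentialScan hints that the file is read front to back;
            // an allocation size of 0 leaves the default in place.
            return new SqlFileStream(
                filePath, txContext, FileAccess.Read,
                FileOptions.SequentialScan, 0);
        }
        catch (Exception)
        {
            // The open failed, so abandon the transaction and let the caller see the error.
            trn.Rollback();
            throw;
        }
    }
}
```

A caller would typically wrap the returned stream in a using block and commit the transaction only after the stream has been closed.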

The other options that you can specify are:

• FileOptions.None – Indicates that you do not want to specify any additional flags.
• FileOptions.WriteThrough – Instructs the system to send writes directly to disk, bypassing any intermediate caches.
• FileOptions.Asynchronous – Informs the system that you intend to use the file for asynchronous reading and writing.
• FileOptions.RandomAccess – Informs the system that you might access the file randomly. The system will use this information to optimize file caching for random access.
• FileOptions.SequentialScan – Informs the system that you will use the file sequentially. When this flag is specified, the system optimizes the file operations on the basis that you will not access the file backwards or jump into a random location in the file.

Note that using any option other than those listed above will result in an ArgumentOutOfRangeException error.

The last parameter in Listing 3-35 specifies the default file allocation size while creating the file. You can specify 0 to instruct the system to use the default size specified by the NTFS volume.

Note, also, that if an error occurs when calling the constructor, the current transaction should be rolled back.

SqlFileStream exposes a number of methods that enable you to manipulate the FILESTREAM data, as described below.

• BeginRead() – Begins an asynchronous read operation.
• BeginWrite() – Begins an asynchronous write operation.

• Close() – Closes the FILESTREAM data file. This should be done before the transaction is committed or rolled back.
• CopyTo() – Copies all the bytes from the current stream to a destination stream.
• EndRead() – Waits for a pending asynchronous read to complete.
• EndWrite() – Ends an asynchronous write operation.
• Flush() – Clears buffers and forces the data to be written to the FILESTREAM data file.
• Read() – Reads a sequence of bytes from the FILESTREAM data file.
• ReadByte() – Reads a byte from the FILESTREAM data file.
• SetLength() – Sets the length of the current stream.
• Write() – Writes a sequence of bytes into the FILESTREAM data file.
• WriteByte() – Writes a byte into the FILESTREAM data file.

For additional reference, see the MSDN documentation at: http://brurl.com/fs5.

OpenSqlFilestream API Reference

OpenSqlFilestream() is a Win32 API function exposed by the SQL Server Native Client that gives you access to the FILESTREAM data file. This function returns a Win32 file handle pointing to the FILESTREAM data file associated with a given FILESTREAM data value.

The OpenSqlFilestream API is exposed from the dynamic link library sqlncli.dll. The declaration of the various symbols for accessing the OpenSqlFilestream API function is found in header file sqlncli.h, and the static library that you need to link is sqlncli.lib. These files can be found in the Include and Lib folders of your SDK directory respectively.

The default location is C:\Program Files\Microsoft SQL Server\100\SDK\Include, but you might find it in a different location, depending upon where you installed SQL Server 2008.

The code snippet in Listing 3-36 shows an example that opens a FILESTREAM data file using the OpenSqlFilestream API function.

    //Get a handle to the FILESTREAM data using OpenSqlFilestream API
    HANDLE srcHandle = INVALID_HANDLE_VALUE;
    srcHandle = OpenSqlFilestream(dstFilePath, SQL_FILESTREAM_READ, 0,
        transactionToken, cbTransactionToken, 0);

Listing 3-36: Opening a FILESTREAM data file for READ access with C++ using Win32 API OpenSqlFilestream().

The first argument (dstFilePath) is the logical path to the FILESTREAM data file, retrieved by a call to the PathName() function on the FILESTREAM column. The second argument specifies the file access level required. This argument can take one of the values listed below.

• SQL_FILESTREAM_READ – Opens the FILESTREAM data file for reading. The file pointer is placed at the beginning of the file.
• SQL_FILESTREAM_WRITE – Opens the FILESTREAM data file for writing. The file pointer is placed at the beginning of the file. A write operation will overwrite the existing content in the file.
• SQL_FILESTREAM_READWRITE – Opens the FILESTREAM data file for reading and writing. The file pointer is placed at the beginning of the file. You can use the SetFilePointer() function to move the file pointer to specific locations in the file and make partial updates, or to move to the end of the file to append content.

The next argument in Listing 3-36 specifies additional flags that help the system to optimize the requested file operation. The different values this parameter can take are described below.

• SQL_FILESTREAM_OPEN_NONE – Indicates that you do not want to specify any additional flags.
• SQL_FILESTREAM_OPEN_FLAG_NO_WRITE_THROUGH – Instructs the system to send writes directly to disk, bypassing any intermediate caches.
• SQL_FILESTREAM_OPEN_FLAG_ASYNC – Informs the system that you intend to use the file for asynchronous reading and writing.
• SQL_FILESTREAM_OPEN_FLAG_RANDOM_ACCESS – Informs the system that you might access the file randomly. The system will use this information to optimize file caching for random access.
• SQL_FILESTREAM_OPEN_FLAG_SEQUENTIAL_SCAN – Informs the system that you will use the file sequentially. When this flag is specified, the system optimizes the file operations on the basis that you will not access the file backwards or jump into a random location in the file.
• SQL_FILESTREAM_OPEN_FLAG_NO_BUFFERING – Requests the system to open the file with no buffering support.

The next argument in Listing 3-36 (transactionToken) is the FILESTREAM transaction context retrieved by calling the T-SQL function GET_FILESTREAM_TRANSACTION_CONTEXT(). This is followed by another parameter, cbTransactionToken, which specifies the size of the buffer that stores the FILESTREAM transaction context.

The final parameter in Listing 3-36 specifies the initial allocation size of the data file in bytes. This parameter is ignored when the file is opened for READ operation. If this parameter is NULL, the system will use the default allocation size.

Note that FILESTREAM data can be accessed using the OpenSqlFilestream() function only if the application connects to SQL Server using Windows authentication.

If OpenSqlFilestream() succeeds, it returns a valid handle to the FILESTREAM data file associated with the logical path name passed into the function as the first parameter. If the function fails, it returns INVALID_HANDLE_VALUE. The handle value returned by this function can be used to manipulate the FILESTREAM data associated with the handle. The Win32 API functions below can be used on the handle returned by the OpenSqlFilestream() function.

• ReadFile() – Reads information from the FILESTREAM data file.
• WriteFile() – Writes information into the FILESTREAM data file.
• TransmitFile() – Used to transmit FILESTREAM data over a socket.
• SetFilePointer() – Moves the file pointer to a desired location in the disk file associated with the handle. The next read/write operation will start from the new location.
• SetEndOfFile() – Truncates the file at the current file pointer position.
• FlushFileBuffers() – Writes the data in all the buffers associated with a file into the disk.
• CloseHandle() – Closes the FILESTREAM file handle. If the table has UPDATE triggers, they will be fired when the FILESTREAM file handle is closed.

Note that the FILESTREAM data handle cannot be used with any Win32 APIs other than those listed above. If you pass the handle to any other Win32 API function, you will get an ERROR_ACCESS_DENIED error.

The FILESTREAM file handle should be closed by calling CloseHandle() before the current transaction is committed or rolled back.

Summary

When accessing FILESTREAM data through the Win32 or .NET interfaces exposed by SQL Server, you can get better performance in most cases because SQL Server memory is not used for processing the BLOB data. In addition, your code can benefit from the caching and streaming capabilities of NTFS. Any .NET application can use the SqlFileStream class, and Win32 applications can use the OpenSqlFilestream() API, to gain streaming access to the FILESTREAM data.

A FILESTREAM transaction context, and the logical path name to the FILESTREAM data file to be processed, are two basic ingredients that you will need prior to accessing a BLOB value stored in the FILESTREAM cell. Once an explicit SQL Server transaction has been started, the FILESTREAM transaction context can be retrieved by calling the T-SQL function GET_FILESTREAM_TRANSACTION_CONTEXT(). The logical path name to the FILESTREAM data file can be retrieved by calling the new case-sensitive intrinsic function PathName() on the FILESTREAM column.

OpenSqlFilestream() returns a file handle to the FILESTREAM data file, which can be used to read and write information stored in the FILESTREAM data file. The .NET class SqlFileStream encapsulates the FILESTREAM data file and provides a number of methods that allow you to read and write information into the FILESTREAM data file. The FILESTREAM file handle, or the SqlFileStream object, should be closed before the transaction is committed or rolled back.

Chapter 4: FILESTREAM with Entity Framework and LINQ to SQL

In this chapter, we run through some labs that demonstrate how applications that employ ADO.NET Entity Framework (EF) or LINQ to SQL in their data access layer can interact with the FILESTREAM feature.

The focus will be on the FILESTREAM end of the equation, rather than on the specifics of the given data access framework or technology, and the chapter adopts a highly practical approach, consisting of the worked labs below.

• Lab 2: Using FILESTREAM data with EF. This lab actually comprises two separate examples:
    • Lab 2a: a simple application that adopts the "default" EF approach of using T-SQL to access the FILESTREAM data
    • Lab 2b: a custom approach that allows continued use of streaming I/O access with SqlFileStream.
• Lab 3: Using FILESTREAM data with LINQ to SQL.

As you can see, there is greater emphasis in this chapter on EF than on LINQ to SQL. The latter has been characterized by Microsoft as being for "strongly-typed LINQ access to SQL Server, for rapidly developed applications" or, less formally by members of the Microsoft developer community, as "the quick and dirty way to do it." In contrast, ADO.NET Entity Framework "provides strongly-typed LINQ access for applications requiring a more flexible Object Relational mapping, across Microsoft SQL Server and third-party databases."

If it sounds like LINQ to SQL plays second fiddle to EF's lead violin, then I'd say that this is a fair reflection of community opinion and of the development of the two technologies by Microsoft; the former is being "supported" on an ongoing basis, the latter is being actively developed and improved. Nevertheless, LINQ to SQL has an endearing simplicity that continues to make it popular among its users, hence its inclusion here.

Before we get started, a note of sensible caution: the labs in this chapter demonstrate a particular way to accomplish tasks with these technologies; they don't attempt to cover all possible approaches, nor do we guarantee that the approach shown will be the best one for your specific environment. Use the labs as a starting point, then adapt them as necessary for your needs, and bear in mind that you'll also need to rework the code to take into account the specific nature of your security architecture, and so on.

Lab 2: FILESTREAM and Entity Framework

ADO.NET EF has become one of the most widely discussed and used object relational mapping (ORM) frameworks for .NET, and many questions have been raised on various forums regarding the use of EF with FILESTREAM columns.

As we have seen, one of the key benefits of FILESTREAM storage is that the data can be accessed using Win32 or Managed APIs, which provide streaming performance. Though EF allows you to access FILESTREAM columns, the access is provided through the T-SQL interface. At the time of writing this, EF does not provide a built-in way to access FILESTREAM data through Win32 or Managed APIs, so you might want to reconsider your decision to use EF to access FILESTREAM data and instead write some custom code that performs the FILESTREAM operations within an EF project.

Our labs demonstrate both approaches. Lab 2a creates a basic application that uses EF to access FILESTREAM data directly, and hence access the FILESTREAM data using T-SQL. Lab 2b shows a technique that allows us to continue to use the SqlFileStream class to access the FILESTREAM data through EF.

Each lab will follow the same basic structure.

1. Create a stored procedure through which EF will access the FILESTREAM data.
2. Create a console application project.
3. Add an Entity Data Model Object to the project.
4. Create the ORM.
5. Insert a row with FILESTREAM data into the table.
6. Retrieve rows along with FILESTREAM data.

In both cases, the goal is to help you to get started with writing EF applications that read and write FILESTREAM data. We will examine only a few basic features offered by ADO.NET EF, a more detailed discussion of which is beyond the scope of this book.

Rebuilding the sample database

Before we get started with the labs, we need to create a new sample database. Run the T-SQL script in Listing 4-1 to rebuild the NorthPole sample database.

    USE master
    GO
    -- ---------------------------------------------------
    -- If the sample database exists, drop it and re-create
    -- ---------------------------------------------------
    IF DB_ID('NorthPole') IS NOT NULL
        DROP DATABASE NorthPole
    GO

    CREATE DATABASE NorthPole ON
    PRIMARY (
        NAME = NorthPoleDB,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter4\NorthPoleDB.mdf'
    ), FILEGROUP NorthPole_fs CONTAINS FILESTREAM (
        NAME = NorthPole_fs,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter4\NorthPole_fs'
    )
    LOG ON (
        NAME = NorthPole_log,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter4\NorthPole_log.ldf'
    )
    GO

Listing 4-1: Rebuilding the sample NorthPole database.

Next, create a table to store the FILESTREAM data using the code in Listing 4-2.

    USE NorthPole
    GO
    CREATE TABLE [dbo].[Items](
        [ItemID] INT IDENTITY PRIMARY KEY,
        [ItemGuid] UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
        [ItemNumber] VARCHAR(20),
        [ItemDescription] VARCHAR(50),
        [ItemImage] VARBINARY(MAX) FILESTREAM NULL
    )
    GO

Listing 4-2: Creating a table for the FILESTREAM data.

Lab 2a: A simple EF application using T-SQL access

In this lab we'll create a simple application that will treat the FILESTREAM column as a plain VARBINARY(MAX) column. This is the default way Entity Framework handles FILESTREAM data. For applications that seek to leverage the streaming benefits of FILESTREAM, this isn't a suitable approach, so you may want to review the alternative approach in Lab 2b.

Create the stored procedure interface

Our first step is to create a stored procedure for adding new items into the table (Listing 4-3). This stored procedure will be invoked from the EF project.

    CREATE PROCEDURE [dbo].[SaveItem] (
        @ItemNumber VARCHAR(20),
        @ItemDescription VARCHAR(50),
        @ItemImage VARBINARY(MAX)
    )
    AS
        INSERT INTO dbo.Items (ItemGuid, ItemNumber, ItemDescription, ItemImage)
        SELECT NEWID(), @ItemNumber, @ItemDescription, @ItemImage

        RETURN SCOPE_IDENTITY()

Listing 4-3: Creating a stored procedure to add items to the table.

Note that this procedure accepts item information including the image, which will go to the FILESTREAM column of the table. Because we are using the T-SQL interface for writing into the FILESTREAM column, a VARBINARY(MAX) variable is used. Remember that the FILESTREAM columns appear as VARBINARY(MAX) data type values when accessing them from T-SQL.

Creating a console application project

The next step is to create a simple application. For simplicity, we will create a console application. Open Visual Studio, select File | New | Project, and then select Console Application. You might want to select your favorite .NET programming language (VB or C#) at this stage. The programming language will not make much difference because there is not much code in this lab. We present C# code in this chapter; the sample code for VB is supplied separately in the download.

Adding an Entity Data Model

The next step is to add an ADO.NET Entity Data Model to the project. Right-click on your project in Solution Explorer, and select Add | New Item. Then select ADO.NET Entity Data Model (Figure 4-1).

Figure 4-1: Add New Item dialog box.

We will name the Entity Data Model NorthPoleModel.edmx. Click OK to proceed to the Entity Data Model Wizard (Figure 4-2).

Figure 4-2: Generating the model.

Select Generate from database, because we have already created the database needed for this lab. The wizard will generate EF wrapper classes for the selected database objects when completed. Next, we choose the data connection (Figure 4-3).

Figure 4-3: Choosing the data connection.

You may remember that, for accessing FILESTREAM through Win32 or Managed APIs, Windows authentication was necessary. However, as mentioned earlier, EF accesses FILESTREAM data through T-SQL only. Therefore, when you enter your database credentials, you can use either Windows authentication or SQL Server authentication.

The next step allows you to select the database objects you wish to include in your data model (Figure 4-4). We'll select the Items table and the stored procedure SaveItem, and name the data model NorthPoleModel.

Figure 4-4: Choosing the database objects.

Clicking Finish saves the information and opens the Model Designer window.

Configuring object relational mapping

The next step is to review and configure the mapping of properties and functions. The Entity Data Model designer creates data model objects based on the selections we made using the wizard. In our case, we selected the Items table, so you will see a model object named Item that represents each row in the Items table (Figure 4-5).

Figure 4-5: The Item data model object.

Click on the Item object to open the Mapping Details window (Figure 4-6). If the window does not open automatically on your system, right-click the Item entity and select Table Mapping from the shortcut menu.

Figure 4-6: The Mapping Details window.

Take a look at the mapping of the ItemImage column; it is a FILESTREAM column but it is mapped to the Binary data type. EF does not understand FILESTREAM values. It sees FILESTREAM columns as VARBINARY(MAX) data type values.

The next step is to create a mapping for the stored procedure we created for saving items. Click on the Map Entity to Functions button on the left side of the Mapping Details window, and select the stored procedure from the drop-down list provided for the insert operation (Figure 4-7).

Figure 4-7: Creating a mapping for the stored procedure.

With this, we are done with the mapping and it is time to go ahead and write the code that adds a new item into the table using EF.

Adding a row to the table

Having completed the object relational mapping, it is quite easy to perform operations that manipulate the data stored in the associated tables. The C# code in Listing 4-4 adds a new record into the Items table. One of the columns being populated is a FILESTREAM column. The content of a disk file is read and copied into the ItemImage property of the EF class Item, which represents the FILESTREAM column in the table.

    static void Main(string[] args)
    {
        Item itm = new Item();
        itm.ItemNumber = "IT001";
        itm.ItemDescription = "Microsoft Mouse";
        itm.ItemImage =
            System.IO.File.ReadAllBytes("c:\\temp\\microsoftmouse.jpg");

        NorthPoleEntities npe = new NorthPoleEntities();
        npe.AddObject("Items", itm);
        try
        {
            npe.SaveChanges();
        }
        catch (Exception ex)
        {
            Console.Write(ex.Message);
        }
        npe.Dispose();
    }

Listing 4-4: Adding a new record to the Items table using EF.

Note that there is nothing FILESTREAM-specific in this code. Though the column being populated is a FILESTREAM column, the code treats it as a VARBINARY(MAX) column.

Retrieving information from the database

Now, let us see how to retrieve data from FILESTREAM columns using EF. Just as we saw in the previous example, there is no FILESTREAM-specific code involved in this step either. EF treats FILESTREAM columns as VARBINARY(MAX) columns, and therefore the code that we will write will be the same for FILESTREAM and VARBINARY(MAX) columns.

The code in Listing 4-5 searches for a record with ItemNumber IT001, and returns an Item object that contains the information about the given record. The last part of the code reads the FILESTREAM value stored in the ItemImage column and writes it to a disk file. Add the code to the Main() method in the Program class. You must also import the System.Data.Objects namespace at the top of your code file.

    NorthPoleEntities npe = new NorthPoleEntities();
    // The string predicate is Entity SQL, so the literal must be quoted
    foreach (Item itm in npe.Items.Where("it.ItemNumber == 'IT001'"))
    {
        System.IO.File.WriteAllBytes("c:\\temp\\mm.jpg", itm.ItemImage);
    }
    npe.Dispose();

Listing 4-5: Retrieving BLOB values stored in a FILESTREAM column.

That completes Lab 2a, in which we created a very simple console application that demonstrated how to read and write FILESTREAM values using EF. Run the application, or debug and step through the code, to see our code in action.

Lab 2b: Using SqlFileStream and Entity Framework together

In this lab, we'll create an application similar to the one from Lab 2a but one that can still use the SqlFileStream class to access the FILESTREAM data, while at the same time maintaining the object oriented features of Entity Framework for accessing the relational data. This is done by retrieving the structured data using EF's mapping capabilities and the FILESTREAM data using custom code developed in an application's data access layer.

Modifying the stored procedure interface

To help support the needs of the application, we will create two additional stored procedures in the sample database.

The first, Items_Insert, effectively replaces the SaveItem stored procedure from the previous lab. It is used to create a new row in the Items table and ensures that the FILESTREAM data file for the row is created by setting a dummy value (Listing 4-6).

    CREATE PROCEDURE [dbo].[Items_Insert]
        @ItemGuid uniqueidentifier,
        @ItemNumber nvarchar(20),
        @ItemDescription nvarchar(50)
    AS
    BEGIN
        -- SET NOCOUNT ON added to prevent extra result sets from
        -- interfering with SELECT statements.
        SET NOCOUNT ON;

        INSERT INTO [Items] (ItemGuid, ItemNumber, ItemDescription, ItemImage)
        VALUES (@ItemGuid, @ItemNumber, @ItemDescription, 0x);

        SELECT SCOPE_IDENTITY() AS NewItemID;
    END

Listing 4-6: Creating a stored procedure to create a row with a dummy FILESTREAM value.

The stored procedure takes three arguments that are mapped directly to fields in the Items table. There is no input parameter for the ItemImage field because we provide a default dummy value of 0x (zero-length binary value). This ensures that a FILESTREAM data file will be created for the new record, but the file is a zero-length (empty) file.

We also select the value of the SCOPE_IDENTITY() function as a new field with name NewItemID. Because ItemID is an IDENTITY column, we will use the value of that field to update the ItemID scalar property value of the stored Item object instance with the value assigned by the database engine.

The second new stored procedure, GetSqlFileStreamInfo (Listing 4-7), will return the values necessary to create a SqlFileStream object instance for the FILESTREAM data of a given item (identified by the ItemID parameter). The stored procedure will return a single row containing the path name of the FILESTREAM data file, the transaction context of a previously started transaction and the length of the FILESTREAM data file.

    CREATE PROCEDURE [dbo].[GetSqlFileStreamInfo]
        @ItemID int
    AS
    BEGIN
        -- SET NOCOUNT ON added to prevent extra result sets from
        -- interfering with SELECT statements.
        SET NOCOUNT ON;

        -- Insert statements for procedure here
        SELECT I.ItemImage.PathName() AS FilePath,
               GET_FILESTREAM_TRANSACTION_CONTEXT() AS "Context",
               DATALENGTH(I.ItemImage) AS "Size"
        FROM Items I
        WHERE I.ItemID = @ItemID
    END

Listing 4-7: Creating a stored procedure to retrieve the SqlFileStream constructor values.

Creating a console application project

The next step is to create a simple application. We will create another console application. Open Visual Studio, select File | New | Project, and then select Console Application. You might want to select your favorite .NET programming language (VB or C#) at this stage. The programming language will not make much difference because there is not much code in this lab. We present C# code in this chapter; the sample code for VB is supplied separately in the download.

Provide a suitable name for your project and solution, such as FILESTREAM Chapter 4 Lab 2b.

Adding an Entity Data Model

The next step is to add an ADO.NET Entity Data Model to the project. Right-click on your project in Solution Explorer, and select Add | New Item. Then select ADO.NET Entity Data Model (Figure 4-8).

Figure 4-8: The Add New Item dialog box.

We will name the Entity Data Model NorthPoleModel.edmx. Click OK to proceed to the Entity Data Model Wizard (Figure 4-9).

Figure 4-9: Generating the model.

Select Generate from database, because we have already created the database needed for this lab. The wizard will generate EF wrapper classes for the selected database objects when completed. Next, we choose the data connection (Figure 4-10).

In Lab 2a, it was noted that using SQL Server authentication to connect to the database was acceptable because we would only access the FILESTREAM data using T-SQL. In this lab, we must use Windows authentication because the application will access the FILESTREAM data using the Managed APIs.

Figure 4-10: Choosing the data connection.

The next step allows us to select the database objects we wish to include in our data model (Figure 4-11). We'll select the Items table and the stored procedures Items_Insert and GetSqlFileStreamInfo, and name the data model NorthPoleModel.

Figure 4-11: Choosing the database objects.

Clicking Finish saves the information and opens the Model Designer window.

Configuring object relational mapping

We will now make some modifications to the default object relational mapping that was created. Most importantly, we will remove the mapping of the ItemImage field so that it will not be included in the standard SELECT statements generated by Entity Framework. Right-click on the ItemImage scalar property in the Item entity on the entity diagram and click Delete. This effectively removes the ItemImage field from the mapping.

Next, we will map the Items_Insert stored procedure so that it becomes the Insert Function for the Items table. In the Mapping Details window click on the Map Entity to Functions button, found on the left-hand side of the window (Figure 4-12).

Figure 4-12: The Map Entity to Functions button.

Click on the <Select Insert Function> row and select Items_Insert from the drop-down list. The parameters for the function are mapped correctly by default. In the Result Column Bindings node, type NewItemID (the name of the field in the return set of our Items_Insert stored procedure) and press Enter. The designer will automatically and correctly map this value to the ItemID scalar property.

Finally, we will map the GetSqlFileStreamInfo stored procedure to a function in the data access layer. To add a function, right-click empty space in the Entity Model designer and click Add > Function Import. The Add Function Import dialog box (Figure 4-13) opens. Select the GetSqlFileStreamInfo stored procedure from the Stored Procedure Name drop-down list. Choose a suitable name for the function; in our example we will keep it the same as the stored procedure name.

Our stored procedure returns a single row containing three columns. The structure of that row does not map to our Item entity though, so we will let the Entity Framework designer create a new class whose scalar properties map to the fields returned by the stored procedure.

First, click the Get Column Information button, then click the Create New Complex Type button. This will automatically change the Returns a Collection Of option group value to Complex. The designer will assign a default name of GetSqlFileStreamInfo_Result to the new complex type.

Figure 4-13: The Add Function Import dialog box.

You can see the effects of these steps in the Model Browser window.

Adding a row to the table

We will create a new method in the Program.cs file to insert a new item in the database. The code for this method is shown in Listing 4-8. This code will go in the Program class.

    static int InsertItem()
    {
        Item NewItem = new Item();
        NewItem.ItemNumber = "IT002";
        NewItem.ItemGuid = System.Guid.NewGuid();
        Console.WriteLine("New Item GUID: {0}",
            NewItem.ItemGuid.ToString());
        NewItem.ItemDescription = "Microsoft Mouse";

        using (NorthPoleEntities e = new NorthPoleEntities())
        {
            bool OpenedConn = false;
            if (e.Connection.State != System.Data.ConnectionState.Open)
            {
                e.Connection.Open();
                OpenedConn = true;
            }
            try
            {
                // Start a transaction explicitly to ensure adding the row and
                // creating the BLOB are part of the same transaction
                // System.Transactions.TransactionScope can be used if required
                using (System.Data.Common.DbTransaction tran =
                    e.Connection.BeginTransaction())
                {
                    e.Items.AddObject(NewItem);
                    // Because this uses a stored procedure (instead of
                    // generated T-SQL)
                    // this will create a FILESTREAM data file
                    // If adding records using a stored proc is not desired,
                    // then this can be set as a DEFAULT
                    e.SaveChanges();

                    // Now read the item using SqlFileStream, leveraging
                    // the stored procedure that's mapped as a function
                    GetSqlFileStreamInfo_Result info =
                        e.GetSqlFileStreamInfo(NewItem.ItemID).ToList()[0];

                    // If the path is NULL, a BLOB file does not exist
                    // and a BLOB must be created first
                    using (SqlFileStream fs = new SqlFileStream(
                        info.FilePath, info.Context, FileAccess.Write,
                        FileOptions.WriteThrough, 0L))
                    using (FileStream local = new FileStream(
                        "C:\\temp\\MicrosoftMouse.jpg",
                        FileMode.Open, FileAccess.Read))
                    {
                        local.CopyTo(fs);
                    }
                    tran.Commit();
                }
            }
            finally
            {
                // If the connection was opened by this procedure, close it
                if (OpenedConn) e.Connection.Close();
            }
        }
        return NewItem.ItemID;
    }

Listing 4-8: InsertItem method.

This code first creates a new instance of the Item class and, for demonstration purposes, assigns values to the ItemNumber, ItemGuid and ItemDescription properties. Then, an instance of the Entity Framework Object Context (NorthPoleEntities class) is created. If the object context does not have an open connection to the database, we open the connection and keep track of the fact that we did so (this means that we can properly close it when the method ends).

As we have already discussed, access to FILESTREAM data must occur in the context of a SQL Server transaction, so we start a transaction using the BeginTransaction() method on the Connection object associated with the object context. As part of that transaction, we add the newly created Item object to the object context and save the changes to the database.

Saving the changes to the database causes the ItemID property of the NewItem object to be set to the value assigned by the database engine (recall the mapping we modified earlier). We use this value as the parameter for the GetSqlFileStreamInfo method. Even though we know that the GetSqlFileStreamInfo method can only return a single object instance, because the SELECT statement in the stored procedure filters by the table's primary key (ItemID), the framework doesn't have that knowledge and will return a set of GetSqlFileStreamInfo_Result objects. We will simply take the first and only item in that list.

Using the values in the GetSqlFileStreamInfo_Result object, we can create a new SqlFileStream instance for write access. From the local file system, we open a sample image file using standard .NET file I/O APIs. We then copy the bytes from the local file to the SqlFileStream.

Finally, we commit the transaction and, if we opened it, close the database connection.

Retrieving information from the database

To end this lab, we are going to add some code that can retrieve a specific item (row) from the database. We will use the Entity Framework object relational mapping to retrieve the relational data, but use streaming I/O access to work with the FILESTREAM data.

Listing 4-9 shows the ReadItem method to be added to the Program class. This code requires that you import two namespaces in your code file: System.Data.SqlTypes and System.IO.

    static void ReadItem(int itemId)
    {
        // Obtain info about the object
        using (NorthPoleEntities e = new NorthPoleEntities())
        {
            // This does not cause the BLOB to be read with T-SQL as long as
            // the BLOB column is in a different object,
            // or excluded from the model altogether (preferred)
            Item i = e.Items.SingleOrDefault(itm => itm.ItemID == itemId);
            if (i != null)
            {
                Console.WriteLine("Read Item GUID: {0}", i.ItemGuid.ToString());
                bool OpenedConn = false;
                try
                {
                    // Ensure the connection is open before
                    // attempting to start a transaction
                    // NOTE: connection is probably closed
                    if (e.Connection.State != System.Data.ConnectionState.Open)
                    {
                        e.Connection.Open();
                        OpenedConn = true;
                    }
                    // There are also ways to use
                    // System.Transactions.TransactionScope
                    // to enlist in an existing transaction
                    // at the business logic level if desired
                    using (System.Data.Common.DbTransaction tran =
                        e.Connection.BeginTransaction())
                    {
                        // Now read the item using SqlFileStream,
                        // leveraging the stored
                        // procedure that's mapped as a function
                        GetSqlFileStreamInfo_Result info =
                            e.GetSqlFileStreamInfo(i.ItemID).ToList()[0];

                        // If there is a BLOB, there is a file path
                        if (!string.IsNullOrEmpty(info.FilePath) &&
                            info.Size > 0)
                        {
                            using (MemoryStream ms = new MemoryStream())
                            {
                                using (SqlFileStream fs = new SqlFileStream(
                                    info.FilePath, info.Context, FileAccess.Read,
                                    FileOptions.WriteThrough, (long)info.Size))
                                {
                                    fs.CopyTo(ms);
                                }
                                // TODO: Do some things with the memory stream,
                                // or convert to byte array if serialization is required
                            }
                        }
                        tran.Commit();
                    }
                }
                finally
                {
                    if (OpenedConn) e.Connection.Close();
                }
            }
        }
    }

Listing 4-9: ReadItem method.

This method first creates an Entity Framework object context (NorthPoleEntities class) and retrieves the desired item. In this case, we used a lambda expression to indicate which item we are looking for. Lambda expressions may look a little intimidating, but the basics can be learned easily. Here, we are indicating that, in the collection of all Items, we are looking for a single item that matches the predicate (think WHERE-clause) ItemID == itemId, where ItemID is the scalar property of the entity and itemId is the parameter value of the ReadItem function. We use the .SingleOrDefault() method to ensure that if there is no item with the specified ItemID, the object context will return a null reference instead of throwing an exception.
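The behavior of .SingleOrDefault() described above can be tried without a database. The following is a minimal sketch that runs the same predicate shape against an in-memory list using LINQ to Objects; DemoItem and SingleOrDefaultDemo are hypothetical stand-ins for illustration and are not part of the generated model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the generated Item entity (illustration only).
public class DemoItem
{
    public int ItemID { get; set; }
    public string ItemNumber { get; set; }
}

public static class SingleOrDefaultDemo
{
    // Returns the one item whose ItemID matches, or null when none matches.
    // Throws InvalidOperationException when more than one item matches.
    public static DemoItem FindById(IEnumerable<DemoItem> items, int itemId)
    {
        // Same shape as in ReadItem: entity property on the left,
        // method parameter on the right.
        return items.SingleOrDefault(itm => itm.ItemID == itemId);
    }
}
```

Calling FindById with an ID that is not in the list returns null, which is why ReadItem checks i != null before going any further. If the predicate could match several rows, .SingleOrDefault() throws instead, so it is best suited to lookups by a unique key such as ItemID.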

If we successfully retrieved the item's relational data from the database, we will create a new SQL Server transaction after ensuring that we have an open database connection (much like we did for the InsertItem method). We again leverage the GetSqlFileStreamInfo method that maps to the GetSqlFileStreamInfo stored procedure to retrieve the information we need to create a SqlFileStream instance. Before we create the instance, we check that there is a valid FilePath value (a null value would indicate there is no FILESTREAM data file for this data row) and that the size of the file is greater than zero (there is no sense in reading a zero-length file). We then copy the SqlFileStream contents to a MemoryStream, which can be used for other purposes.

To test the functionality in this lab, add the code in Listing 4-10 to the Main() method in the Program class.

    int NewItemId = 0;
    NewItemId = InsertItem();
    ReadItem(NewItemId);
    Console.Write("Done...");
    Console.ReadKey();

Listing 4-10: Sample code for void Main().

Now compile and run the code. You should find that the program outputs the same Item GUID twice, indicating that the item that was saved to the database using the InsertItem() method was successfully read back using the ReadItem() method.
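Both InsertItem and ReadItem rely on Stream.CopyTo, which was introduced in .NET Framework 4. If a project targets an earlier framework version, a manual read/write loop does the same job. The following is a minimal sketch of such a loop; the StreamCopier class and its CopyInChunks method are hypothetical names used for illustration, not part of the book's sample code or of the .NET Framework.

```csharp
using System;
using System.IO;

// Hypothetical helper: copies one stream to another in fixed-size chunks,
// for framework versions that do not offer Stream.CopyTo.
public static class StreamCopier
{
    // Returns the total number of bytes copied.
    public static long CopyInChunks(Stream source, Stream destination, int bufferSize)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (destination == null) throw new ArgumentNullException("destination");
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException("bufferSize");

        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;

        // Stream.Read returns 0 only when the end of the stream is reached.
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Write only the bytes actually read in this pass.
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

With this helper, a call such as local.CopyTo(fs) could be written as StreamCopier.CopyInChunks(local, fs, 65536). The 64 KB buffer size is an arbitrary choice for the sketch; only one buffer's worth of data is held in memory at a time, regardless of the size of the BLOB.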

Lab 3: FILESTREAM and LINQ to SQL

This lab presents a very basic example that will help you to get started writing applications that use LINQ to SQL classes to access the FILESTREAM data. Just as we saw with the default EF approach in Lab 2a, LINQ to SQL also accesses FILESTREAM data through T-SQL interfaces only; it is not capable of accessing the FILESTREAM data using Win32 or Managed APIs.

To complete this lab, we will follow the procedure below.

1. Create a console application project.
2. Add a LINQ to SQL class object to the project.
3. Create a LINQ to SQL wrapper class for the Items table.
4. Use LINQ to SQL to add a new row with a FILESTREAM value to the table.
5. Query the table using LINQ to SQL and retrieve FILESTREAM data.

Creating a new console application project

Let's get started. Open Visual Studio, and create a new console application project by selecting File | New | Project as demonstrated in the previous labs.

Adding a LINQ to SQL class

The next step is to add a LINQ to SQL class to the project. Right-click on the project in the Solution Explorer and select Add | New Item. Then, in the Add New Item dialog box, select LINQ to SQL Classes, and rename the class to NorthPoleDB (Figure 4-14).

Figure 4-14: Adding a LINQ to SQL class.

Click Add to add the class to the current project. This will open the Object Relational Designer. The designer window will be empty because we have not yet done any mapping.

Defining object relational mappings

The next step is to define object relational mappings for the Items table. Open the Server Explorer window by clicking the Server Explorer hyperlink in the Object Relational Designer window (Figure 4-15).

Figure 4-15: The Server Explorer.

Locate the Items table under the Tables node, then drag and drop the Items table into the Object Relational Designer window (Figure 4-16).

Figure 4-16: The Items table in the Object Relational Designer.

Notice that LINQ to SQL automatically maps FILESTREAM columns to the System.Data.Linq.Binary data type. You can see this by looking at the properties of the ItemImage property of the mapping object, as shown in Figure 4-17.

Figure 4-17: FILESTREAM mapping to the System.Data.Linq.Binary data type.

That is all you need to do. We are ready to write the code that performs FILESTREAM data manipulation using LINQ to SQL classes.

Adding a new row

Let us see how to add a new row into the Items table using LINQ to SQL classes, and populate the FILESTREAM column. Listing 4-11 shows the C# code to do this.

    static void SaveItem()
    {
        Item itm = new Item();
        itm.ItemDescription = "Microsoft Mouse";
        itm.ItemGuid = System.Guid.NewGuid();

        System.Data.Linq.Binary img;
        img = new System.Data.Linq.Binary(
            System.IO.File.ReadAllBytes("c:\\temp\\MicrosoftMouse.jpg"));
        itm.ItemImage = img;
        itm.ItemNumber = "IT001";

        NorthPoleDBDataContext db = new NorthPoleDBDataContext();
        db.Items.InsertOnSubmit(itm);
        db.SubmitChanges();
        db.Dispose();
    }

Listing 4-11: Adding a new row using LINQ to SQL.

Querying FILESTREAM data using LINQ to SQL

The final step in this lab is to see how to query and retrieve FILESTREAM data using LINQ to SQL classes. Listing 4-12 searches for an item with a specific item number and retrieves the FILESTREAM value associated with the row.

    static void Main(string[] args)
    {
        NorthPoleDBDataContext db = new NorthPoleDBDataContext();
        var itms = from p in db.Items
                   where p.ItemNumber == "IT001"
                   select p;

        foreach (var itm in itms)
        {
            Console.WriteLine("Item Description = {0}",
                itm.ItemDescription);
            System.IO.File.WriteAllBytes("c:\\temp\\MicrosoftMouse.jpg",
                itm.ItemImage.ToArray());
        }
        db.Dispose();
    }

Listing 4-12: Querying a FILESTREAM table using LINQ to SQL.

Now run your application or debug and step through the code to see how it works.

Remember, LINQ to SQL accesses FILESTREAM data through the T-SQL interface. Streaming access to the FILESTREAM data is not available when accessing it through LINQ to SQL. This might result in performance problems on systems that perform a large number of FILESTREAM operations using LINQ to SQL.

Summary

We have seen how to write applications that access FILESTREAM data using ADO.NET EF and LINQ to SQL. The former is one of the most popular object relational mapping frameworks available for .NET. The latter offers simplicity and remains a viable data access option, but has become less prevalent following the introduction of Entity Framework.

Neither of these frameworks is ideal for working with FILESTREAM data. Both EF and LINQ to SQL use the T-SQL interface to access the FILESTREAM data by default. This means that the performance benefits offered by the streaming APIs are not available to applications that use these frameworks "out of the box" to access the FILESTREAM data. Performance may degrade badly if the application performs a large volume of FILESTREAM operations using the native FILESTREAM mappings of either of these frameworks.

We demonstrated an approach using EF (one that is equally valid for LINQ to SQL) that uses specialized code in order to combine the framework with streaming access to FILESTREAM.

Chapter 5: FILESTREAM with ASP.NET and Silverlight

This chapter will explain how to access FILESTREAM data from applications built using ASP.NET and Silverlight. It will adopt a very hands-on approach, consisting of the six labs below.

• Lab 4: Uploading files to a FILESTREAM database from an ASP.NET web page.
• Lab 5: Deploying and configuring a FILESTREAM web application on IIS.
• Lab 6: Creating a web handler to serve images from a FILESTREAM database.
• Lab 7: Displaying thumbnail images of items from a FILESTREAM database on an ASP.NET web page.
• Lab 8: Using SqlFileStream in an N-tier scenario.
• Lab 9: Playing a video stored in a FILESTREAM database on a Silverlight application.

The functionalities that we try to implement in these labs are probably familiar to most of you. The first five labs in this chapter deal with uploading and downloading image files. Most developers will have done this numerous times in their applications, by uploading those files to a location on the server and serving the files upon client request later. What is different in these labs is that we will store the data in the FILESTREAM column of a database table, then serve client requests from there.

We'll use the same sample database for Labs 4–8 in this chapter, as shown in Listing 5-1.

    USE master
    GO
    IF DB_ID('NorthPole') IS NOT NULL
        DROP DATABASE NorthPole
    GO

    CREATE DATABASE NorthPole ON PRIMARY (
        NAME = NorthPole,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter5\NorthPole.mdf'
    ), FILEGROUP NorthPole_fs CONTAINS FILESTREAM (
        NAME = NorthPole_fs,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter5\NorthPole_fs'
    )
    LOG ON (
        NAME = NorthPole_log,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter5\NorthPole_log.ldf'
    )
    GO

Listing 5-1: Rebuilding the sample NorthPole database.

Listing 5-2 creates the usual Items table, with a FILESTREAM column.

    USE NorthPole
    GO
    CREATE TABLE [dbo].[Items] (
        [ItemID] INT IDENTITY PRIMARY KEY,
        [ItemGuid] UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
        [ItemNumber] VARCHAR(20),
        [ItemDescription] VARCHAR(50),
        [ItemImage] VARBINARY(MAX) FILESTREAM NULL
    )
    GO

Listing 5-2: Creating the Items table.

Lab 4: Uploading Files to a FILESTREAM Database from an ASP.NET Web Page

Most websites provide some sort of functionality to upload files to the web server, usually storing them in a disk folder and serving the files later upon client requests. In this lab, we will see how to upload a file from an ASP.NET web page to the FILESTREAM column of a table.

To do this, we will follow the procedure below.

1. Create an ASP.NET web application.
2. Add a database connection string to the configuration file web.config.
3. Design a data entry page with a file upload control.
4. Write the code to upload the file into a FILESTREAM database.
5. Run the web application and verify that the code works.

Creating a new ASP.NET web application

Open Visual Studio, and select File | New | Project. Select ASP.NET Web Application from the project templates, and then click OK to create the project (Figure 5-1).

If you are using Visual Studio 2010, you will see options to create projects targeting a number of different .NET Framework versions, such as 2.0, 3.0, 3.5, and 4.0. The features we will explore in this lab are available in all of these versions but, to keep complexity to a minimum, we will create this project in .NET 2.0. When using FILESTREAM in projects targeting .NET Framework versions 2.0 or 3.0, Service Pack 2 is required.

Figure 5-1: Creating a new project.

Visual Studio automatically creates a web page named default.aspx, and we will use that page for this lab.

Adding a database connection string

The next step is to add a database connection string to the configuration file of the web application, web.config. Locate web.config in the Solution Explorer, and double-click on it to open the configuration file in the Visual Studio editor.

Locate the section, connectionStrings, and add a connection string that points to the FILESTREAM database you would like to use for the labs. Listing 5-3 shows how the code should look after the connection string has been added to the configuration file (but you should use the data source appropriate to your own connection).

Ensure the connection string uses Windows authentication (Integrated Security=SSPI); we have already learnt that Windows authentication is required to access the FILESTREAM data using Win32 or Managed APIs.

Listing 5-3: Adding a connection string to point to the FILESTREAM database.

Designing the data entry page

Next, we'll build a very simple data entry page that allows a user to enter product information and save it into the database along with an image of the product. We will add an HTML table to the web page and put two text boxes (for Item Number and Item Description), a file upload control for the item image file, and a button to submit the data.

Depending upon your preference, you might want to write your code in the XML source editor or using the UI designer. The source code of the ASPX page is shown in Listing 5-4.

    ASP.NET FILESTREAM Application
    Item Number:
    Item Description:
    Item Image:

Listing 5-4: The source code for the data entry page.

Figure 5-2 shows how your page should look after the controls have been added to the page.

Figure 5-2: The data entry page.

Writing the "code-behind"

The next task is to write the code that runs behind the web page. When the user clicks the Save button, we need to save the item information, including the image file, into the database. In this section, we will see the code required to implement this functionality.

We will start with the import statements. Add the import statements in Listing 5-5 to the code file associated with your web page.

    using System.IO;
    using System.Data.SqlClient;
    using System.Data;
    using System.Data.SqlTypes;   // SqlFileStream lives in this namespace

Listing 5-5: Import statements.

To make our code run when the user clicks the Save button, we'll add a handler for the Click event on the button. In a real-world scenario, you might want to do a number of validations before saving the information into the database, such as checking whether the user has entered all the mandatory fields. To keep the code simple in this lab, we will skip these validations and validate only the file upload control; we will proceed with the database call only if the user has selected a file using the file upload control provided on the web page. This can be verified by checking the HasFile property of the file upload control, as shown in Listing 5-6.

    protected void btnSave_Click(object sender, EventArgs e)
    {
        if (fpItemImage.HasFile)
        {
            SaveItem();
        }
    }

Listing 5-6: Handler for Click event of the button.
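The validations that the lab skips could be collected in a small helper that the Click handler calls before SaveItem(). The following is only a minimal sketch: the class name ItemValidation and its messages are hypothetical, and the length limits are an assumption based on the VARCHAR(20) and VARCHAR(50) parameters of the SaveItemImage stored procedure introduced later in this lab.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the lab's code.
public static class ItemValidation
{
    // Assumed limits, matching the VARCHAR(20)/VARCHAR(50) procedure parameters.
    public const int MaxItemNumberLength = 20;
    public const int MaxItemDescriptionLength = 50;

    // Returns a list of problems; an empty list means the input is acceptable.
    public static List<string> Validate(string itemNumber, string itemDescription, bool hasFile)
    {
        List<string> errors = new List<string>();

        if (itemNumber == null || itemNumber.Trim().Length == 0)
            errors.Add("Item Number is required.");
        else if (itemNumber.Length > MaxItemNumberLength)
            errors.Add("Item Number is too long.");

        if (itemDescription == null || itemDescription.Trim().Length == 0)
            errors.Add("Item Description is required.");
        else if (itemDescription.Length > MaxItemDescriptionLength)
            errors.Add("Item Description is too long.");

        // Mirrors the HasFile check in Listing 5-6.
        if (!hasFile)
            errors.Add("An item image file is required.");

        return errors;
    }
}
```

In the handler, the call might look like ItemValidation.Validate(txtItemNumber.Text, txtItemDescription.Text, fpItemImage.HasFile), with SaveItem() invoked only when the returned list is empty.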

The code in Listing 5-6 calls the SaveItem() function to perform the database operations, which we have not yet created. Saving the data into the database involves two steps.

1. Save the relational data (ItemNumber, ItemDescription).
2. Save the FILESTREAM data (ItemImage).

Relational data is always saved using the T-SQL interface, whereas we can use either T-SQL or Win32/Managed APIs to save the FILESTREAM data. We have learned earlier that the FILESTREAM read-write operations should always be done using Win32 or Managed APIs for performance reasons, so that is exactly what we will do in this lab.

Saving the relational data

The first step is to save the relational data, in this case using the SaveItemImage stored procedure in Listing 5-7, which we can call from our web application.

    CREATE PROCEDURE [dbo].[SaveItemImage]
        (
          @ItemNumber VARCHAR(20) ,
          @ItemDescription VARCHAR(50)
        )
    AS
    BEGIN
        INSERT INTO dbo.Items
                ( ItemGuid ,
                  ItemNumber ,
                  ItemDescription ,
                  ItemImage
                )
        OUTPUT  GET_FILESTREAM_TRANSACTION_CONTEXT() AS txContext ,
                inserted.ItemImage.PathName() AS filePath
                SELECT  NEWID() ,
                        @ItemNumber ,
                        @ItemDescription ,
                        0x
    END

Listing 5-7: Stored procedure to save item information.

Note that this stored procedure inserts a dummy value into the FILESTREAM column because, as we have already seen in Chapter 3, Win32/Managed API access to a FILESTREAM column is possible only if the column is non-NULL.

The stored procedure returns a result set that contains the PathName(), the logical identifier pointing to the disk file associated with the FILESTREAM column in the newly inserted row, and a FILESTREAM transaction context of the currently active transaction. This is done to avoid an additional database call when writing the code to save the FILESTREAM data. For more information, refer back to Chapter 3.

Saving the FILESTREAM data

After the relational data is saved, we will execute the code to save the FILESTREAM data using the FILESTREAM Managed API. We saw in Chapter 3 an example of how to load the content of a disk file into the FILESTREAM column. In this case, we will obtain a stream pointing to the file selected by the user through the FileContent property of the file upload control (Listing 5-8).

    private void SaveItem()
    {
        //Create and open a database connection
        SqlConnection cn = new SqlConnection();
        cn.ConnectionString = System.Configuration.ConfigurationManager
            .ConnectionStrings["NorthPoleDB"].ToString();
        cn.Open();

        //Create a Command object
        SqlCommand cmd = new SqlCommand();
        cmd.CommandText = "SaveItemImage";
        cmd.Connection = cn;
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.AddWithValue("@ItemNumber", txtItemNumber.Text);
        cmd.Parameters.AddWithValue("@ItemDescription", txtItemDescription.Text);

        //Begin a transaction
        SqlTransaction transaction = cn.BeginTransaction("ItemTran");
        cmd.Transaction = transaction;

        byte[] txContext = null;
        string filePath = string.Empty;
        try
        {
            //Execute the command
            SqlDataReader reader = cmd.ExecuteReader();
            if (reader.Read())
            {
                txContext = (reader["txContext"] as byte[]);
                filePath = reader["filePath"].ToString();
            }
            reader.Close();

            //Open the FILESTREAM data file for writing
            SqlFileStream fs = new SqlFileStream(filePath, txContext, FileAccess.Write);

            //Get a stream object pointing to the content to be uploaded
            Stream localFile = fpItemImage.FileContent;

            //Start transferring data into FILESTREAM data file
            // If you are using .NET 4.0 the copy loop below can be
            // replaced with a single call to "localFile.CopyTo(fs, 4096);"
            BinaryWriter bw = new BinaryWriter(fs);
            const int bufferSize = 4096;
            byte[] buffer = new byte[bufferSize];
            int byteCount = localFile.Read(buffer, 0, bufferSize);
            //Read returns 0 at end of stream; a short final chunk must still be written
            while (byteCount > 0)
            {
                bw.Write(buffer, 0, byteCount);
                byteCount = localFile.Read(buffer, 0, bufferSize);
            }
            bw.Flush();

            //Close the files
            bw.Close();
            localFile.Close();
            fs.Close();

            //Commit only after the data has been written successfully
            transaction.Commit();
        }
        catch (Exception)
        {
            //Undo the insert if anything failed, then let the caller see the error
            transaction.Rollback();
            throw;
        }
        finally
        {
            //Close the connection
            cn.Close();
        }
    }

Listing 5-8: Saving item information to a FILESTREAM database.

Running the application

We have successfully completed the coding for this lab, so it is time for us to run the application and see it in action: press F5 or click the Start Debugging button on the toolbar. This will open the web page in a new web browser instance, as shown in Figure 5-3.

Figure 5-3: The web page.

Enter an item number and description, select an image file using the Browse button, and then click Save to insert the information into the database.
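A chunked copy such as the one in Listing 5-8 has to keep writing until Read returns 0, because the last chunk of a file is usually shorter than the buffer. The sketch below isolates that loop so that it can be tried without a database; the StreamCopy class and CopyAll method are hypothetical names, and plain MemoryStream objects stand in for the uploaded file and the SqlFileStream.

```csharp
using System;
using System.IO;

// Hypothetical helper, shown only to illustrate the copy loop.
public static class StreamCopy
{
    // Copies everything from source to destination in fixed-size chunks
    // and returns the total number of bytes copied.
    public static long CopyAll(Stream source, Stream destination, int bufferSize)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (destination == null) throw new ArgumentNullException("destination");
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException("bufferSize");

        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int byteCount;

        // Read returns 0 only at end of stream; any positive count,
        // including a short final chunk, must still be written.
        while ((byteCount = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, byteCount);
            total += byteCount;
        }

        destination.Flush();
        return total;
    }
}
```

With a 10,000-byte source and a 4,096-byte buffer, the loop runs three times (4,096 + 4,096 + 1,808 bytes). A loop that stops as soon as a read returns fewer bytes than the buffer size would skip the final 1,808 bytes.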

The code we wrote in the Click event of the Save button will insert a row into the Items table along with the FILESTREAM data. Verify that the information is correctly saved in the database by running SELECT * FROM Items in SSMS. If everything is set up correctly, you will see that the data you entered in the web page shows up in the SSMS result window.

Lab 5: Deploying and Configuring a FILESTREAM Web Application on IIS

In Lab 4, we created, ran, and tested a web application, all within Visual Studio. By default, when you run a web application from Visual Studio it runs on a built-in web server, which is designed for debugging and testing web applications in a development environment.

In a production environment, we usually configure our web application on more robust web servers, such as Microsoft IIS. Configuring and running a web application that deals with FILESTREAM data on IIS needs some careful attention.

In this lab, we will see how to deploy the web application we created in Lab 4 on IIS 7.0. There are different ways of deploying a web application on a web server; we will use the inbuilt functionality provided by Visual Studio to keep the process simple.

To do this, we will:

• run Visual Studio under an administrator account
• publish the web application to a new virtual directory on local IIS
• configure security for FILESTREAM access
• run and test the application.

Running Visual Studio under an administrator account

To deploy a web application from Visual Studio, you need to run Visual Studio under an administrator account. With administrative privileges on the computer, we can do this by right-clicking on the Visual Studio option in the Start menu and selecting Run as administrator.

Publishing the web application into a new virtual directory

In this step, we will publish the web application to a new virtual directory on local IIS. Right-click on the project name in Solution Explorer, and select Publish. This will open the Publish Web dialog box (Figure 5-4).

Figure 5-4: The Publish Web dialog box.

The Publish Web dialog box provides a number of different publication options. We are going to deploy it into a local IIS virtual directory, so we will select File System from the drop-down list. Click on the […] button and it will open a dialog box that enables us to select a virtual directory under the local IIS server.

Next, we create a new virtual directory for our application. Click the Local IIS button and then click the New Virtual Directory button on the top right of the page to create a new virtual directory (Figure 5-5).

Figure 5-5: Selecting a virtual directory under the local IIS server.

We'll call the new virtual directory fsweb. Choose the file system location for the published files and click OK to return to the parent window (Figure 5-6).

Figure 5-6: Creating the new virtual directory.

Select the new virtual directory that we just created (fsweb) (Figure 5-7).

Figure 5-7: Selecting the target location.

Click OK to proceed to the next step. The Publish Web dialog box is displayed again, showing our selections (Figure 5-8).

Figure 5-8: Starting the publication.

We have selected the Publish Method and Target Location, so now click the Publish button to start the publication process. The project is published into the fsweb virtual directory under the local IIS web server. The Visual Studio status bar will show the progress of the operation.

When the operation completes, we can go ahead and run the application from the browser. Open a web browser and enter: http://localhost/fsweb/default.aspx (or the appropriate URL if your local configuration is different). This will open the web page we created in the previous lab (Figure 5-9).

Figure 5-9: The resulting web page.

Now fill the data entry page with information for a new item, and click Save to send the information to the database. Surprisingly, in most cases you will see the error shown in Figure 5-10 (or similar) when you click Save.

Figure 5-10: Error displayed when you click Save.

The Open() method of the connection object raises an error. This is because we have configured the connection to use Integrated Security (Windows authentication), but the user account under which the web application runs does not have permissions to access the database. However, Windows authentication is necessary for accessing the FILESTREAM data through the Managed or Win32 APIs, which means we need to persevere with Integrated Security. So, we now need to fix this error by reconfiguring the security settings.

Configuring IIS to use SQL Server Integrated Security

There are a number of ways to fix the login error we saw in the previous section, and the most suitable method depends upon the security level required, the type of application, the way users access the application, and so on. We'll use one of the simpler options here.

To begin, open IIS Manager by typing inetmgr in the Run window, or by selecting it from the Administrative Tools section of the control panel. Then, in the IIS Manager window, right-click on the Application Pools node and select Add Application Pool to open the Add Application Pool dialog box.

Now, we can create a new application pool for our FILESTREAM web application. We'll call the application pool fsweb and select .NET version 2.0, since we created a .NET 2.0 application (Figure 5-11).

Figure 5-11: The Add Application Pool dialog box.

Click OK, and IIS Manager creates a new application pool, which will show up in the list of application pools (Figure 5-12).

Figure 5-12: The new application pool.

In the list of application pools, right-click fsweb and select Advanced Settings, or select fsweb and click the Advanced Settings link on the right-hand side of the IIS Manager window. In the Advanced Settings dialog box, select the Identity property and click on the button on the right to change the value of the property (Figure 5-13).

Figure 5-13: Advanced Settings dialog box.

In the Application Pool Identity window, select Built-in account, and then select NetworkService from the drop-down list (Figure 5-14).

Figure 5-14: Changing the Identity setting.

Click OK to save the settings, and close the Advanced Settings dialog box.

We have created a new application pool with the identity of NetworkService. Now we'll associate this newly created application pool with our web application. In the IIS Manager, select the virtual directory we created earlier (fsweb), and select the Basic Settings option from the Actions section on the right-hand side. Then, on the Edit Application dialog box, click Select to change the application pool associated with our web application (Figure 5-15).

Figure 5-15: Changing the application pool.

In the Select Application Pool dialog box, select the new fsweb application pool (Figure 5-16).

Figure 5-16: Selecting the fsweb application pool.

Click OK to return to the parent window, and click OK again to save the changes and return to the IIS Manager. After completing this step, our application is configured to run under the identity of NetworkService.

The next step is to make sure that the Network Service account has permissions to access our database. If it does not have access to the SQL Server instance, we need to create a new login with the required permissions. We can do this using SSMS, by right-clicking on Logins in the SQL Server instance tree and selecting New Login (Figure 5-17).

Figure 5-17: Creating a new login for the Network Service account.

The New Login dialog box is displayed. In Login name, select the fully qualified name of the Network Service account. The easiest way is to use the Search button and select the account from the search results (Figure 5-18). This assumes that SQL Server is running on the same machine as the IIS application pool. If not, the web server's machine account needs to be given permissions instead.

Figure 5-18: Specifying the name of the Network Service account.

Now go to the User Mapping page. We will give the new account permissions to access the NorthPole database. For the purpose of this lab, I am assigning the db_owner role to the newly created login (Figure 5-19). However, in most real-world scenarios you would probably assign a lesser set of permissions; it is very important to make sure you give only those permissions that are absolutely necessary. In a typical production environment, the database administrator will do this based on the security policy of the organization.

Figure 5-19: Assigning permissions to the Network Service login.

With this, we are done with security configuration and it is time to test the website and verify that it works.

Running the application

Run the application again in your favorite web browser. Enter the item information, including the location of an image file to load into the FILESTREAM column, and click Save to push the information to the database (Figure 5-20).

Figure 5-20: Running the application.

This time, the information should be successfully added into the database. You can verify this in SSMS by running a SELECT * FROM Items query and checking that the newly entered data shows up in the table.

Congratulations! You have successfully completed the second ASP.NET lab. It is time to move on to the next lab and explore some more interesting stuff.

Lab 6: Creating a Web Handler to Serve Images from a FILESTREAM Database

In Lab 4, we saw how to upload files into the FILESTREAM column of a SQL Server database from a web page. In this lab, we will see how to serve those files back to the client applications, upon request.

We will create a web page to retrieve images from the database and send them to the client browser. A web page is not the best way to implement such a handler, but at this stage we want to minimize complexity, so that we can focus on FILESTREAM and not on ASP.NET features. In a later lab, we will develop a similar solution using a generic handler, which is a much better approach.

To do this, we will:

• create a stored procedure to retrieve the FILESTREAM information
• create the web page that serves item images
• run and test the application.

Creating a stored procedure to retrieve FILESTREAM data

Start by creating a stored procedure that will retrieve the logical path name of the FILESTREAM value, and the FILESTREAM transaction context. This stored procedure, shown in Listing 5-9, will take an item number as input parameter and will return the FILESTREAM information.

    CREATE PROCEDURE GetItemImageInfo
        (
          @ItemNumber VARCHAR(20)
        )
    AS
    BEGIN
        SELECT  ItemImage.PathName() AS filePath ,
                GET_FILESTREAM_TRANSACTION_CONTEXT() AS txContext
        FROM    Items
        WHERE   ItemNumber = @ItemNumber
    END

Listing 5-9: Creating the stored procedure.

Creating the web page to serve images

The next step is to create a web page that can serve images to the client browsers. Add a new ASP.NET web form to the project and name it GetItemImage.aspx.

The design part of this page does not need any changes because this page is not used as a UI element; its purpose is only to serve the image of the specified item to the client browser. In most cases, we'd use a generic handler (.ashx file) for this type of scenario, which will be slightly faster than using an ASPX page. However, we will use an ASPX page for simplicity.

Let's start writing the code for this page. We'll begin by adding the necessary import statements to the code file GetItemImage.aspx.cs.

    using System.IO;
    using System.Data.SqlClient;
    using System.Data;
    using System.Data.SqlTypes;

Listing 5-10: The import statements.

The web page expects a parameter in the query string named ItemNumber (Listing 5-11).

    protected void Page_Load(object sender, EventArgs e)
    {
        string ItemNumber;
        ItemNumber = Request.QueryString["ItemNumber"].ToString();

        //Create and open a database connection
        SqlConnection cn = new SqlConnection();
        cn.ConnectionString = System.Configuration.ConfigurationManager
            .ConnectionStrings["NorthPoleDB"].ToString();
        cn.Open();

        //Create a Command object
        SqlCommand cmd = new SqlCommand();
        cmd.CommandText = "GetItemImageInfo";
        cmd.Connection = cn;
        cmd.CommandType = System.Data.CommandType.StoredProcedure;
        cmd.Parameters.AddWithValue("@ItemNumber", ItemNumber);

        //Begin a transaction
        SqlTransaction trn = cn.BeginTransaction("ItemTran");
        cmd.Transaction = trn;

        string filePath = null;
        byte[] txContext = null;
        try
        {
            //Execute the command
            SqlDataReader reader = cmd.ExecuteReader();
            if (reader.Read())
            {
                txContext = (reader["txContext"] as byte[]);
                filePath = reader["filePath"].ToString();
            }
            reader.Close();

            // Open the FILESTREAM data file for reading
            SqlFileStream fs = new SqlFileStream(filePath, txContext, FileAccess.Read);

            // Start transferring data from the FILESTREAM data file
            Response.BufferOutput = false;
            Response.ContentType = "image/jpeg";
            BinaryWriter bw = new BinaryWriter(Response.OutputStream);
            const int bufferSize = 4096;
            byte[] buffer = new byte[bufferSize];
            int byteCount = fs.Read(buffer, 0, bufferSize);
            // Read returns 0 at end of stream; a short final chunk must still be written
            while (byteCount > 0)
            {
                bw.Write(buffer, 0, byteCount);
                byteCount = fs.Read(buffer, 0, bufferSize);
            }
            bw.Flush();

            // Close the FILESTREAM file and commit the transaction
            fs.Close();
            trn.Commit();
        }
        catch (Exception)
        {
            //do some error handling
            trn.Rollback();
            throw;
        }
        finally
        {
            // Close the connection
            cn.Close();
        }

        // End the response so that no page markup follows the image data
        Response.End();
    }

Listing 5-11: Code to serve images stored in a FILESTREAM column.

What is interesting and worth noticing here is how we are sending the data back to the client browser. We do this through the ASP.NET Response object. We first set the content type to image/jpeg to tell the browser that we are sending back an image. Then we access the OutputStream property of the Response object and write the content of the FILESTREAM file into it (Listing 5-12).

    Response.BufferOutput = false;
    Response.ContentType = "image/jpeg";
    BinaryWriter bw = new BinaryWriter(Response.OutputStream);

Listing 5-12: Accessing the OutputStream property of the Response object.

Refer to Chapter 12, in the section entitled Best Practices for FILESTREAM Development, to learn about the potentially useful addition of a timestamp column to allow for caching FILESTREAM data that is sent over the web.

Running and testing the application

We have finished the code, so we can now verify that the application is working as expected. In Visual Studio, set the web page that we created in this lab as the default page, and run the application.

When the web page is open in the browser, add a parameter to the query string. Enter the item number of a record that exists in the table. This will retrieve the image associated with the specified item, and the image will be displayed on your browser (Figure 5-21).

Figure 5-21: Running the application.

In the next lab we will see how to use this page as the URL source for item thumbnail images we display on a web page.
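The handler in this lab always reports image/jpeg as the content type, which is fine as long as every stored image is a JPEG. If the Items table could also hold PNG or GIF files, one option is to inspect the first few bytes of the data before setting Response.ContentType. The sketch below is an illustration only: ImageContentType and FromHeader are hypothetical names, and only the well-known JPEG, PNG, and GIF signatures are checked.

```csharp
using System;

// Hypothetical helper; not part of the lab's code.
public static class ImageContentType
{
    private static readonly byte[] Jpeg = { 0xFF, 0xD8, 0xFF };
    private static readonly byte[] Png = { 0x89, 0x50, 0x4E, 0x47 };
    private static readonly byte[] Gif = { 0x47, 0x49, 0x46, 0x38 }; // "GIF8"

    // Maps the leading bytes of a file to a MIME type,
    // falling back to a generic binary type when nothing matches.
    public static string FromHeader(byte[] header)
    {
        if (StartsWith(header, Jpeg)) return "image/jpeg";
        if (StartsWith(header, Png)) return "image/png";
        if (StartsWith(header, Gif)) return "image/gif";
        return "application/octet-stream";
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data == null || data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

In the page, the first chunk read from the SqlFileStream could be passed to FromHeader before anything is written to the response, since the content type has to be set before the first bytes are sent.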

Lab 7: Displaying Thumbnail Images of Items from a FILESTREAM Database on an ASP.NET Web Page

In this lab, we will create a web page that will display a list of items along with their thumbnail images. We will use the web handler we created in the previous lab to fetch images stored in the FILESTREAM column of a SQL Server database.

To do this, we will:

• create a stored procedure to fetch item details
• add a new web form and a Grid View
• create a data source that retrieves item information, and bind it to the Grid View
• configure the Grid View
• run and test the web page.

Creating the stored procedure to fetch item information

The first step is to create a stored procedure that returns the list of all the items in the Items table. However, before you begin, add a few records to the Items table (I'll leave this as an exercise for the reader; I have populated my sample table with four rows, with four different products and their thumbnail images).

Now, in the SSMS window run the code in Listing 5-13 to create the stored procedure.

    CREATE PROCEDURE GetItemList
    AS
    BEGIN
        SELECT  ItemNumber ,
                ItemDescription ,
                '/GetItemImage.aspx?ItemNumber=' + ItemNumber AS ImageURL
        FROM    Items
    END

Listing 5-13: Creating a stored procedure to retrieve item information.

Notice the third column returned by the stored procedure; the value returned by this column points to the page we created in the previous lab to return images from the database. The value returned by this column will be used to bind the image control on the grid that will display the thumbnails. If you need more sample data, run the upload application we created in Lab 4 and load a few more images into the FILESTREAM table.

Listing 5-14 shows an example result set from the stored procedure, based on the data I entered for this demonstration.

    ItemNumber ItemDescription ImageURL
    ---------- --------------- ----------------------------------
    1001       Microsoft Mouse /GetItemImage.aspx?ItemNumber=1001
    3001       Dell Laptop     /GetItemImage.aspx?ItemNumber=3001
    4005       Apple iPad      /GetItemImage.aspx?ItemNumber=4005
    6009       Amazon Kindle   /GetItemImage.aspx?ItemNumber=6009

Listing 5-14: Example results from the stored procedure.
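The ImageURL values in Listing 5-13 are built in T-SQL and start at the site root, which works when the application itself runs at the root. When the application is published under a virtual directory such as fsweb (as in Lab 5), or when item numbers may contain characters like & or spaces, the same URL could instead be assembled in the web tier. The following is a minimal sketch under those assumptions; ItemImageUrl and Build are hypothetical names, and in an ASP.NET page the applicationPath argument would typically come from Request.ApplicationPath.

```csharp
using System;

// Hypothetical helper; not part of the lab's code.
public static class ItemImageUrl
{
    // Builds the handler URL for one item, relative to the application's
    // virtual path, with the item number escaped for use in a query string.
    public static string Build(string applicationPath, string itemNumber)
    {
        if (itemNumber == null) throw new ArgumentNullException("itemNumber");

        // "/" and "/fsweb/" both normalise to a prefix without a trailing slash.
        string basePath = (applicationPath ?? "/").TrimEnd('/');

        return basePath + "/GetItemImage.aspx?ItemNumber=" + Uri.EscapeDataString(itemNumber);
    }
}
```

For an application at the site root, Build("/", "1001") yields the same value as the stored procedure; under the fsweb virtual directory, the result carries the /fsweb prefix so the browser requests the page from the right application.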

Adding a new web form and data grid

Next, add a new web form named ItemList.aspx to the project, then drag a GridView control from the toolbox and drop it on the web form (Figure 5-22).

Figure 5-22: A GridView control.

Adding a data source

In this step, we will create a data source that fetches the item details, and bind it to the GridView control. Go to the properties of the GridView control, and locate the DataSourceID property. From the drop-down list, select New data source (Figure 5-23).

Figure 5-23: Creating a new data source.

This will open the Data Source Configuration Wizard, which will take you through the configuration process. From the list of data source types available, select Database. We'll name the data source itemds (Figure 5-24).

Figure 5-24: Specifying an ID for the data source.

In the next dialog box, click New Connection to create and configure a new connection to the database. The Add Connection dialog box is displayed (Figure 5-25).

Note that I have used Windows authentication for consistency only; Windows authentication is not necessary for this connection because it is used to execute the stored procedure that fetches item details, but not the images. We will use the handler page we created in the previous lab to fetch the actual images, and that page is already using Windows authentication.

Figure 5-25: The Add Connection dialog box.

Click OK on the Add Connection dialog box, and then click Next to move to the next step in the wizard. Click Next again, to save the connection string to the application configuration file.

When you see the Configure the Select Statement step, select Specify a custom SQL statement or stored procedure, because we already created the stored procedure to retrieve item information (Figure 5-26).

Figure 5-26: Configuring the Select Statement.

In Define Custom Statements or Stored Procedures, select Stored procedure, and from the list of stored procedures select GetItemList (Figure 5-27). Then click Next.

Figure 5-27: Selecting the stored procedure.

In the next step, click the Test Query button to verify that the stored procedure returns the expected results (Figure 5-28).

Figure 5-28: Previewing the data.

This is the last step of the wizard. Click Finish to return to the web page designer window. When the wizard exits, it automatically binds the columns returned by the stored procedure with the columns in the grid control. By default, all columns are bound to text boxes (Figure 5-29).

Figure 5-29: Selecting the Edit Columns option.

This default is fine for the first two columns, but for the last column we need an image control. We need to edit the grid control and change the third column from text box to image control. Click the Edit Columns link to open the column editor window. In the Fields dialog box, remove the ImageURL column that the wizard automatically added (Figure 5-30).

Figure 5-30: Removing the ImageURL text column.

Now, add an ImageField from the Available fields list, and then set the DataImageUrlField property of the image field to the ImageURL column of the data source (Figure 5-31).

Figure 5-31: Adding an ImageField.

You might also want to change the heading of the columns, set the height and width of the image control so that it looks good on the web page, and modify the layout and format of the text to make it look even better. Figure 5-32 shows how my grid control looks in the designer after applying some formatting changes.

Figure 5-32: The grid control.

That is all we need to do in this lab. Our web page is ready. Let's build the project, set the default page to the newly created web form, and run the application (Figure 5-33).

Figure 5-33: Running the application.

You will see the web page load the details of the items in the database and display the thumbnail images associated with those items.

Lab 8: Using SqlFileStream in an N-Tier Scenario

For many reasons, today's web applications are often architected to run across multiple physical computers, referred to as tiers. Using FILESTREAM in a multi-tier design requires attention to an important detail: in order to achieve the most benefit from FILESTREAM, it should be accessed using the Win32 streaming APIs. However, using SqlFileStream also requires an active transaction in the SQL Server database. The consumer of the FILESTREAM data is most often the tier closest to the client, and furthest away from the database. If the SqlFileStream API is used, this requires that this tier has at least some knowledge of the data store. In general, this violates good architectural practices.

In this lab, we will create a solution that makes the presentation tier as independent of this database transaction as possible, while maintaining the performance benefits offered by streaming access to FILESTREAM data. To fully understand the problem involved, let's consider the following example scenario. A web application must handle requests from clients for which the response includes data stored in a SQL Server FILESTREAM column. The web application will be hosted on a front-end IIS web server, while the SQL Server database is hosted on a back-end server. The web application's developers aim to make the application as efficient as possible, and want to use the SqlFileStream class and copy the stream's contents to the Response object's OutputStream (as shown in Listing 5-11 as part of Lab 6). They have four options, as shown below.

• Start the SqlTransaction required for the SqlFileStream to use in the presentation tier.
  This solution requires that the web application has intimate knowledge of the database's configuration and details. Generally, one of the architectural goals of a multi-layered web application is to avoid this.

• Start a SqlTransaction and create the SqlFileStream in the data access layer and pass the SqlFileStream back to the client application.
  This solution does not address the concern that the SqlTransaction must be active while the presentation layer uses the SqlFileStream object. The client, however, has no control over the SqlTransaction, and cannot guarantee either that the transaction will still be active, or that the transaction will be properly disposed when the work is done.

• Create a custom class that will act as a domain object that encapsulates the required database objects and the SqlFileStream.
  An object of this custom class will be instantiated in the data access layer and passed on to the presentation layer. This object will implement IDisposable to ensure that the encapsulated SqlTransaction and SqlConnection objects will be properly disposed when the client is finished with the work. This may ensure somewhat better performance than the fourth solution which, as we will see in a moment, requires an additional stream copy. However, it does not fully abstract the data access method.

• Perform all data access, including the streaming access to FILESTREAM data, in the data access layer and copy the stream data to a MemoryStream. Then return the MemoryStream to the caller.
  This is the most appropriate solution and will be demonstrated in this lab. We create the SqlFileStream object in the data access layer and copy the contents to a MemoryStream. We then pass the MemoryStream reference through the tiers. This involves an extra step (the stream copy from SqlFileStream to MemoryStream) but, in my view, that cost is more than justified because the client is completely unaware of the source of the streamed data.

Creating the Visual Studio solution and projects

For this lab, we will use the same sample database created previously. We will create a new Visual Studio solution and create three projects: one for domain classes, one for a data access layer, each implemented as a class library, and a third for a presentation layer, implemented as a web application.
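Although this lab implements the fourth option, the third option (a disposable domain object that owns the stream and the database objects behind it) is easy to picture with a small sketch. The class below is hypothetical and deliberately generic: it accepts any Stream and any IDisposable resources, so it can be tried with MemoryStream objects. In a FILESTREAM scenario, the stream would be the SqlFileStream and the owned resources would be the SqlTransaction and SqlConnection, and a real implementation would also have to decide whether to commit the transaction before disposing it.

```csharp
using System;
using System.IO;

// Hypothetical domain object: hands a readable stream to the presentation
// layer while keeping ownership of the resources that must outlive it.
public sealed class ItemImageHandle : IDisposable
{
    private readonly Stream _content;
    private readonly IDisposable[] _owned;
    private bool _disposed;

    // owned: resources released after the stream, in the order given
    // (assumed here to be the transaction first, then the connection).
    public ItemImageHandle(Stream content, params IDisposable[] owned)
    {
        if (content == null) throw new ArgumentNullException("content");
        _content = content;
        _owned = owned ?? new IDisposable[0];
    }

    // The stream the caller reads from; unusable once the handle is disposed.
    public Stream Content
    {
        get
        {
            if (_disposed) throw new ObjectDisposedException("ItemImageHandle");
            return _content;
        }
    }

    // Safe to call more than once; releases the stream, then the owned resources.
    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;

        _content.Dispose();
        foreach (IDisposable resource in _owned)
        {
            if (resource != null) resource.Dispose();
        }
    }
}
```

The presentation layer would wrap its use of the object in a using block, which guarantees that the underlying resources are released even if writing to the response fails. The trade-off described in the list above still applies: the caller must remember to dispose the object, so the data access method is not fully hidden.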

Open Visual Studio and create a blank solution by selecting File | New | Project… from the menu. In the New Project dialog box, under Other Project Types, select Visual Studio Solutions and then Blank Solution. Click OK to create the new, blank solution.

In Solution Explorer, right-click the solution and select Add | New Project…. Select the ASP.NET Empty Web Application project template. In this lab, we will give this project the name Presentation. Click OK to create this new project as part of the solution.

Follow the same process for adding a second new project to the solution, this time using the Class Library template. Name this project DataAccess. Delete the default Class1.cs file from the project. Repeat this process one more time to create another class library with the name Domain.

We will now add references to the appropriate projects. The web application will call into the data access layer to retrieve information from the SQL Server database. Both the web application and the data access layer will use the domain objects. We must therefore add the following references in the Presentation project:
• Domain
• DataAccess.
And in the DataAccess project:
• Domain
• System.Configuration.

In order to start the web application showing an image using the handler we will develop, we must change the start page. Right-click on the web application and select Properties. In the Web tab, change the Start Action to Specific Page and enter GetItemImage.ashx?itemID=1 as the value. We are now ready to start coding the solution.

Writing the domain object code

In a real-world project, you would create a domain object class for every entity your application will use. Because our database stores little in the way of entities and our sample web application will not actually use them, we will skip this step altogether. We did create the Domain project, but only to illustrate the typical architectural setup of such a solution.

Writing the data access code

In the DataAccess project, create a single class with the name ItemData. Change the access modifier of the class to public. At the top of the code file, import the namespaces System.Configuration, System.Data.SqlClient, System.Data.SqlTypes and System.IO. The class will have one method, which retrieves the FILESTREAM data for a specified ItemID (Listing 5-15).

    public class ItemData
    {
        public MemoryStream GetItemImage(string itemID)
        {
            MemoryStream ms = null;
            string ConnectionString = ConfigurationManager
                .ConnectionStrings["NorthPoleDB"].ToString();
            string CmdText = @"SELECT ItemImage.PathName(),
                GET_FILESTREAM_TRANSACTION_CONTEXT()
                FROM Items WHERE ItemID = @itemID";

            using (SqlConnection conn = new SqlConnection(ConnectionString))
            using (SqlCommand cmd = new SqlCommand(CmdText, conn))
            {
                conn.Open();
                using (SqlTransaction tran = conn.BeginTransaction())
                {
                    cmd.Transaction = tran;
                    cmd.Parameters.Add(new SqlParameter("@itemID",
                        System.Data.SqlDbType.VarChar, 20));
                    cmd.Parameters["@itemID"].Value = itemID;

                    using (SqlDataReader dr = cmd.ExecuteReader())
                    {
                        if (dr.HasRows && dr.Read())
                        {
                            using (SqlFileStream sfs = new SqlFileStream(
                                dr.GetString(0),
                                (byte[])dr.GetValue(1), FileAccess.Read))
                            {
                                // Cap the initial capacity at int.MaxValue to avoid
                                // an overflow when the FILESTREAM data file is
                                // larger than 2 GB
                                ms = new MemoryStream(
                                    (int)System.Math.Min(sfs.Length, int.MaxValue));

                                // Copy the SqlFileStream contents to the
                                // MemoryStream (requires .NET 4.0 and later)
                                sfs.CopyTo(ms);
                            }
                        }
                    }
                }
            }
            return ms;
        }
    }

Listing 5-15: Code for the ItemData class.

In this method, new SqlConnection and SqlTransaction objects are created. Those are then used to retrieve the information necessary to create an instance of the SqlFileStream class. Recall that in Chapter 4, as part of the second Entity Framework lab, we created a stored procedure that encapsulates this. That stored procedure could be reused here instead of the embedded SQL statement. Then the contents of the SqlFileStream are copied to a new instance of System.IO.MemoryStream, which is then used as the return value of the method.
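The copy step at the end of this method does not depend on SQL Server at all, so it can be tried out in isolation. The following is a minimal sketch, assuming a hypothetical helper class named StreamBuffering that is not part of the lab code. It accepts any readable Stream (a SqlFileStream would be passed in the same way), sizes the MemoryStream from the source length when that is known, and rewinds the result for the caller.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the lab projects.
public static class StreamBuffering
{
    // Copies the remaining contents of 'source' into a new MemoryStream
    // and positions the result at the start, ready to be read.
    public static MemoryStream ToMemoryStream(Stream source)
    {
        if (source == null) throw new ArgumentNullException("source");

        // Use the source length as an initial capacity hint when it is known.
        // MemoryStream capacity is an int, so the hint is capped at int.MaxValue.
        int capacity = source.CanSeek
            ? (int)Math.Min(source.Length, int.MaxValue)
            : 0;

        MemoryStream ms = new MemoryStream(capacity);
        source.CopyTo(ms);   // Stream.CopyTo() requires .NET Framework 4.0 or later
        ms.Position = 0;     // rewind so the caller reads from the beginning
        return ms;
    }
}
```

Note that the capacity cap only prevents an overflow when the MemoryStream is created. A MemoryStream still cannot hold more than int.MaxValue bytes, so very large FILESTREAM values are better streamed directly than buffered in memory.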

Alternative to the Stream.CopyTo() method
The copy of the stream content in this lab is done using the Stream.CopyTo() method, which was introduced with the .NET Framework Version 4. If your application targets an older Framework version, refer to the other code samples in this chapter for a way of copying the contents of one stream to another using a 4 KB buffer (which is what Stream.CopyTo() also does).

Creating an HTTP handler to serve images

The concept of showing the web page in a web browser will be very similar to the one used in Lab 6, but the code will look different. There will be no reference to any SQL types at all; any data access is completely abstracted in the data access layer (the DataAccess project).

To drive this point home, in the Presentation web application, remove the reference to the System.Data assembly. Then we will also modify the web.config file to include the database connection string. Refer to Listing 5-3 for this code. In the final version of the application, this connection string can be removed. Because we are taking a step-by-step approach to this implementation, we will initially test the code whereby the web application calls into the data access component directly, and the web application's configuration file will provide the required connection string information.

To illustrate a slightly more advanced ASP.NET concept, for this lab we will implement a generic handler, rather than repurposing a web form as a handler. Right-click on the Presentation project, and select Add | New Item and, in the Add New Item dialog box, select the Generic Handler template and give the new item the name GetItemImage.ashx. At the top of the file, add namespace import statements for the DataAccess and System.IO namespaces.

We will write our code in ProcessRequest(), which is the method called by the ASP.NET runtime when the handler is invoked. The code is shown in Listing 5-16.

    public void ProcessRequest(HttpContext context)
    {
        context.Response.ContentType =
            System.Net.Mime.MediaTypeNames.Image.Jpeg;
        context.Response.BufferOutput = false;

        ItemData data = new ItemData();
        MemoryStream ms = data.GetItemImage(
            context.Request.QueryString["itemID"]);
        if (ms != null && ms.Length > 0)
        {
            context.Response.AddHeader("content-length", ms.Length.ToString());
            // Reset the current position in the MemoryStream
            ms.Position = 0;
            // .NET 4.0 and later method only
            ms.CopyTo(context.Response.OutputStream);
        }
    }

Listing 5-16: ProcessRequest() method code.

This code contains no references to SQL, connection strings, or other database access.

This code will run as is, but we have not yet enabled remote access to the DataAccess component. However, at this time, you may want to run and test the web application to ensure that it returns the image as expected, even though we do not yet have a way to split the data access layer from the web application.

Adding remote calling capability

If we want to truly separate the web application from the data access layer, we must have a way to call the methods in the data access class ItemData remotely. Windows Communication Foundation (WCF) is one .NET technology that allows inter-process and inter-machine communication. WCF was introduced with .NET Framework Version 3 and has seen upgrades in more recent releases. An older technology for calling remote code is called .NET Remoting. Even though the .NET Framework still fully supports remoting, WCF is generally recommended. WCF services are easier to implement because less "plumbing" code is required. A thorough discussion of the concepts of WCF is beyond the scope of this book. However, the lab should be easy to follow with a basic understanding of the "ABCs" of WCF: Address + Binding + Contract = Service, as provided by this MSDN article: http://brurl.com/fs6.

To add remote calling capability to our application, we will first create a component that will contain the definition of our WCF service contract. Then, we create a console application to host the data access component as a WCF service. Finally, we will modify the web application to call the data access component running in the console host.

Creating and implementing the service interface

In Solution Explorer, right-click the solution node and select Add | New Project…. In the Add New Project dialog box, select Class Library and name it ServiceInterfaces. Add a reference to the System.ServiceModel assembly to the new project. (A reference to the DataAccess project is not needed here, and would create a circular reference once the DataAccess project references ServiceInterfaces in the next step.) Remove the default Class1.cs file and add a new interface (right-click the project, select Add | New Item… | Interface) with the name IItemData. At the top of the code file, import the System.IO and System.ServiceModel namespaces. The definition of the IItemData interface is shown in Listing 5-17.

    [ServiceContract()]
    public interface IItemData
    {
        [OperationContract()]
        MemoryStream GetItemImage(string itemID);
    }

Listing 5-17: IItemData interface definition.

We now need to modify the DataAccess.ItemData class to implement this new interface. First, add a reference to the new ServiceInterfaces project in the DataAccess project. Then, modify the definition of the ItemData class as shown in Listing 5-18. No change is required, other than to write the interface name, because the ItemData class already has a GetItemImage() method with the correct signature.

    public class ItemData : ServiceInterfaces.IItemData
    {
        /* Other code omitted for brevity */
    }

Listing 5-18: ItemData class implements IItemData interface.

Creating the WCF service host

For this basic lab, we will self-host a WCF service in a console application. In a real-world scenario, you might consider creating a Windows Service or hosting the WCF service under IIS.

Create another new project using the Console Application template. Name this project ConsoleHost. Add a reference to the ServiceInterfaces and DataAccess projects and also to the System.ServiceModel assembly.

At the top of the Program.cs code file, import the namespaces System.ServiceModel, System.ServiceModel.Description, DataAccess and ServiceInterfaces. Create the following code in the Main() method (Listing 5-19).

    Uri a = new Uri("net.tcp://localhost:4133/itemData");
    using (ServiceHost svchost = new ServiceHost(typeof(ItemData), a))
    {
        ServiceEndpoint sep = svchost.AddServiceEndpoint(
            typeof(IItemData), new NetTcpBinding(), "");
        svchost.Open();
        Console.WriteLine("The following EndpointListeners are active:\n");
        foreach (ServiceEndpoint l in svchost.Description.Endpoints)
            Console.WriteLine(l.Address);
        Console.Write("\nListening for requests. Press any key to stop...");
        Console.ReadKey();
        svchost.Close();
    }

Listing 5-19: Creating a WCF service hosted in a console application.

Depending on your system configuration, you may need to select a TCP port other than 4133 if that port is already in use on your system (Port 4133 was arbitrarily selected for this example). We are using the TCP transport here, because we need to transport a MemoryStream object, which is not serializable. We are implementing the ABCs of WCF in these lines of code: after creating a ServiceHost instance (S) at our base URI (A), we are creating the TCP-based endpoint (B) that uses the IItemData interface as the contract (C). We can then open the communication channel and wait for clients to call.

Because our data access component is now effectively hosted in this console application, we need to provide the database connection string in the application's App.config file. Add a new application configuration file named App.config and add the connection string information as in Listing 5-3.

Modifying the web application to call the WCF service

We are now ready to modify the web application. If we did not have control over the service, we might choose to create a Service Reference with a proxy class. This requires a few extra configuration steps on the service side. Because we do have control over the service, we will share the IItemData interface between the service and the client by adding a reference to the ServiceInterfaces project in the web application. We must also add a reference to the System.ServiceModel assembly, but we can remove the reference to the DataAccess project.

In the GetItemImage.ashx handler, we need to modify the code so that we no longer create an instance of the ItemData class directly, but rather call a remote instance using WCF. Listing 5-20 shows the new code for the ProcessRequest() method. The changed lines are those that create the EndpointAddress and the channel, which replace the line that instantiated ItemData. Also add a namespace import for System.ServiceModel at the top of the code file.

    public void ProcessRequest(HttpContext context)
    {
        context.Response.ContentType =
            System.Net.Mime.MediaTypeNames.Image.Jpeg;
        context.Response.BufferOutput = false;

        EndpointAddress a =
            new EndpointAddress("net.tcp://localhost:4133/itemData");
        ServiceInterfaces.IItemData data =
            ChannelFactory<ServiceInterfaces.IItemData>.CreateChannel(
                new NetTcpBinding(), a);

        MemoryStream ms = data.GetItemImage(
            context.Request.QueryString["itemID"]);
        if (ms != null && ms.Length > 0)
        {
            context.Response.AddHeader("content-length", ms.Length.ToString());
            // Reset the current position in the MemoryStream
            ms.Position = 0;
            // .NET 4.0 and later method only
            ms.CopyTo(context.Response.OutputStream);
        }
    }

Listing 5-20: Updated ProcessRequest() method using a WCF call to obtain the FILESTREAM data.

Because the interface definition of IItemData matches that of the ItemData class we called previously, there is no difference in the code beyond the creation of the channel. In essence, we have replaced the call to the constructor of the ItemData class with a WCF call over TCP.

Be sure to make the URI of the EndpointAddress in this code match the URI created in the console application. If you will be running this code across different machines, replace localhost with the name or IP address of the computer running the WCF service. To avoid hard-coding the name or IP address of the server, the WCF client configuration can also be created in the web.config file, but that is not demonstrated here.

Testing the application from Visual Studio

To test this code from Visual Studio, first right-click the ConsoleHost project and select Debug | Start new instance. This will start the console application and, with it, the WCF service. Next, right-click the web application and start a new instance of it, also in debug mode.

If everything is set up correctly, the web application will make a remote call to the WCF service to obtain the MemoryStream containing the image. Other than a possible slight delay, the application is functionally the same as Lab 6. Architecturally, however, the application looks very different because the web application displaying the image no longer relies on knowledge of the data store.

If you have access to two computers (perhaps using virtualization), deploy the WCF console host application to the computer running your NorthPole database instance and deploy the web application to the other computer. Be sure to open up the port number specified in the host's configuration file in the computer's firewall software.

Lab 9: Playing a Video Stored in a FILESTREAM Database on a Silverlight Application

In this lab we will create a Silverlight application that will play a video stored in a FILESTREAM database.

To do this, we will:
• create a new table in the NorthPole database and load a video
• create a Silverlight application
• create the video handler
• add a MediaElement and configure it
• run and test the application.

Setting up the data

First, we'll set up the table and data required for this lab. Create a table in the NorthPole database to store the video files using the T-SQL code shown in Listing 5-21.

    CREATE TABLE Videos
    (
        VideoID UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
        VideoName VARCHAR(100),
        VideoData VARBINARY(MAX) FILESTREAM
    )

Listing 5-21: Creating a table to hold the video data.

Now, we'll add some videos to the table. For this demo, I will use the sample video file that comes with the Windows 7 operating system. Listing 5-22 loads a video file into the table we just created. Make sure that you change the location of the file to point to a .wmv file that exists on the local computer.

    -- Insert the data into the table
    INSERT INTO Videos (VideoID, VideoName, VideoData)
    SELECT NEWID(), 'Windows7 Sample Video',
        CAST(bulkcolumn AS VARBINARY(MAX))
    FROM OPENROWSET(
        BULK 'C:\Users\Public\Videos\Sample Videos\Wildlife.wmv',
        SINGLE_BLOB ) AS x

Listing 5-22: Loading a video into the table.

Next, create a stored procedure that returns the logical path name of the FILESTREAM column, along with the FILESTREAM transaction context (Listing 5-23).

    CREATE PROCEDURE GetVideoInfo
    (
        @videoid UNIQUEIDENTIFIER
    )
    AS
    BEGIN
        SELECT
            VideoData.PathName() AS filePath,
            GET_FILESTREAM_TRANSACTION_CONTEXT() AS txContext
        FROM Videos
        WHERE VideoID = @videoid
    END

Listing 5-23: Creating a stored procedure to return the logical path and FILESTREAM transaction context.

This procedure accepts a VideoID and returns the FILESTREAM information from the associated record.

Creating a Silverlight application

The next step is to create a Silverlight application. In Visual Studio, open the new project page by selecting File | New | Project (Figure 5-34).

Figure 5-34: Creating a Silverlight application.

Select Silverlight Application from the project templates and click OK to create a new Silverlight application project. Visual Studio prompts us to select the project type and Silverlight version (Figure 5-35).

Figure 5-35: Selecting the Web project type and Silverlight Version.

Creating a video handler

The next step is to create a handler that can read the FILESTREAM data from the database and serve the client requests. As before, we'll use an ASPX page for simplicity, even though in the real world you would probably create an ASP.NET handler as a code item and register it in web.config.

To create the video handler, right-click on the project in the Solution Explorer and select Add New Item. Select ASP.NET Web Form and name it getvideo.aspx. The code that you need to add into the web form is very much the same as we have seen in the previous lab, when we created a handler for the images:
• change the name of the query string parameters and local variable
• change the name of the stored procedure
• change the content type returned by the handler.

Listing 5-24 shows the code with these changes applied.

    protected void Page_Load(object sender, EventArgs e)
    {
        string VideoID;
        VideoID = Request.QueryString["VideoID"].ToString();

        //Create and open a database connection
        SqlConnection cn = new SqlConnection();
        cn.ConnectionString = System.Configuration.ConfigurationManager
            .ConnectionStrings["NorthPoleDB"].ToString();
        cn.Open();

        //Create a Command object
        SqlCommand cmd = new SqlCommand();
        cmd.CommandText = "GetVideoInfo";
        cmd.Connection = cn;
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.AddWithValue("@videoid", VideoID);

        //Begin a transaction
        SqlTransaction trn = cn.BeginTransaction("VideoTran");
        cmd.Transaction = trn;

        string filePath = string.Empty;
        byte[] txContext = null;
        try
        {
            //Execute the command
            SqlDataReader reader = cmd.ExecuteReader();
            if (reader.Read())
            {
                txContext = (reader["txContext"] as byte[]);
                filePath = reader["filePath"].ToString();
            }
            reader.Close();

            //Open the FILESTREAM data file for reading
            SqlFileStream fs = new SqlFileStream(filePath, txContext,
                FileAccess.Read);

            //Start transferring data from the FILESTREAM data file
            //to the response
            Response.BufferOutput = false;
            Response.ContentType = "video/x-ms-wmv";
            BinaryWriter bw = new BinaryWriter(Response.OutputStream);
            const int bufferSize = 4096;
            byte[] buffer = new byte[bufferSize];
            int byteCount = fs.Read(buffer, 0, bufferSize);
            //Read() returns 0 at the end of the stream; the last chunk
            //is usually shorter than the buffer
            while (byteCount > 0)
            {
                bw.Write(buffer, 0, byteCount);
                byteCount = fs.Read(buffer, 0, bufferSize);
            }

            //Close the files
            bw.Close();
            fs.Close();

            //End the response last: Response.End() aborts the current
            //thread, so code placed after it would not run
            Response.End();
        }
        catch (Exception ex)
        {
            //Do some error handling
        }
        finally
        {
            // Commit the transaction and close connection
            cmd.Transaction.Commit();
            cn.Close();
        }
    }

Listing 5-24: Code that serves video files stored in a FILESTREAM database.
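The buffered copy at the heart of Listing 5-24 is easy to get subtly wrong, so it can help to isolate it. The following is a minimal sketch, assuming a hypothetical helper class named BufferedCopy that is not part of the lab code. Two details matter: the loop must continue until Read() returns 0 (a read that returns fewer bytes than the buffer size does not signal the end of the stream), and each Write() must pass the number of bytes actually read.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the lab projects.
public static class BufferedCopy
{
    // Copies everything remaining in 'source' to 'destination' using a
    // fixed-size buffer and returns the total number of bytes copied.
    public static long Copy(Stream source, Stream destination, int bufferSize = 4096)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (destination == null) throw new ArgumentNullException("destination");
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException("bufferSize");

        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int bytesRead;

        // Read() returns 0 only at the end of the stream.
        while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Write only the bytes that were read in this pass.
            destination.Write(buffer, 0, bytesRead);
            total += bytesRead;
        }
        return total;
    }
}
```

In the handler, a call such as BufferedCopy.Copy(fs, Response.OutputStream) could stand in for the BinaryWriter loop. This is also a workable substitute for Stream.CopyTo() on Framework versions earlier than 4.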

Make sure that you add a connection string named NorthPoleDB to the web.config (Listing 5-25).

Listing 5-25: Adding a connection string.

Adding and configuring MediaElement

Next, we will add a MediaElement to the Silverlight application. The MediaElement component will be responsible for playing the video data that our handler will serve to the Silverlight application. Open MainPage.xaml in the designer window and drag a MediaElement object into it. Resize it so that it looks like Figure 5-36.

Figure 5-36: Adding a MediaElement object.

The next step is to set the source of the MediaElement to the video handler we created in the previous step. Open the property page of the MediaElement, set the AutoPlay property to True and configure the Source property as demonstrated in Figure 5-37.

Figure 5-37: Setting the Source property of the MediaElement.

In this lab we are using a static URL as the source of the video, which can play only the one video configured at design time. We can easily change it to a dynamic value by programmatically setting the Source property at run time. Note also that the URL will need to change after deploying the application on a web server.

Let's take a look at the structure of the URL specified in the Source property. To build this URL, we need to know the port on which Visual Studio runs the built-in web server (http://localhost:36005) and the GUID value stored in the VideoID column of the row that contains the video to be played (videoid=696CEFAC-3DC1-4239-83A3-7836C5).

By default, Visual Studio runs a web application using a built-in web server when you run a project from the IDE. The default configuration identifies a dynamic port and binds the web server to it. We can configure the web application project to use a specific port from the Web tab of the project properties page (Figure 5-38).
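Setting the Source property at run time, as mentioned above, comes down to building the handler URL from the site address and a VideoID. The following is a minimal sketch, assuming a hypothetical helper class named VideoUrlBuilder; the handler name getvideo.aspx and the videoid query string parameter are taken from this lab, while the site root is whatever address the web application runs on.

```csharp
using System;

// Hypothetical helper, not part of the lab projects.
public static class VideoUrlBuilder
{
    // Builds the absolute URL of the getvideo.aspx handler for one video.
    // 'siteRoot' should end with a slash when it includes a virtual
    // directory, for example http://server/app/.
    public static Uri Build(Uri siteRoot, Guid videoId)
    {
        if (siteRoot == null) throw new ArgumentNullException("siteRoot");

        // "D" formats the GUID as 32 hexadecimal digits separated by hyphens.
        string relative = "getvideo.aspx?videoid=" +
            Uri.EscapeDataString(videoId.ToString("D"));

        return new Uri(siteRoot, relative);
    }
}
```

The resulting Uri could then be assigned to the Source property of the MediaElement in the code-behind of MainPage.xaml, in place of the static value configured in the designer.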

Figure 5-38: Configuring a web project to run on a specific port when executing from Visual Studio IDE.

Running and testing the application

The final step is to run and test the Silverlight application. Build the project and run the application. This will open the Silverlight application test page (which is part of the project) in the default browser and load the application. The media element we added will invoke the video handler, which will stream video data to the MediaElement. This, in turn, will play it in the browser (Figure 5-39).

Figure 5-39: Running the application.

Summary

ASP.NET and Silverlight are two of the most popular web application development platforms available for .NET programmers today. The labs presented in this chapter were created with the intention of helping the ASP.NET and Silverlight developer community to get started with developing web applications that deal with FILESTREAM data.

In Lab 4, we created an ASP.NET web page which demonstrated how to create web applications that allow the user to upload files into a FILESTREAM database. In Lab 5, we saw that web applications that deal with FILESTREAM storage might break when deployed on the web server (IIS). Applications that deal with FILESTREAM should use Windows authentication when connecting to SQL Server. In most cases, the user account under which the application is configured on IIS may not have access to the SQL Server database, which causes an application failure. We saw how to fix this in Lab 5.

Lab 6 demonstrated how to serve images stored in a FILESTREAM database to the web clients. For simplicity, we created an ASP.NET web form to do this. However, a more appropriate way would be by using an ASP.NET handler, and the FILESTREAM-related code that we wrote in the handler web form can be added to a generic handler with no additional changes. Lab 7 used the web handler to display image files. Lab 8 then took the next step and separated the implementation of the data store from the client. We also used an actual ASP.NET handler in that lab.

In Lab 9, we saw how to create a Silverlight application that plays video files stored in a FILESTREAM-enabled SQL Server database. We created another handler web page for serving the video files stored in the database. A MediaElement component added to the Silverlight application invoked the video handler, which streamed the video data to the client web page.

Chapter 6: FILESTREAM with SSIS and SSRS

This chapter will demonstrate how to access FILESTREAM data from SQL Server Business Intelligence (BI) projects, namely with SQL Server Integration Services (SSIS) and SQL Server Reporting Services (SSRS). We will complete the labs below.
• Lab 10: Loading BLOB values into a FILESTREAM column using SSIS.
• Lab 11: Exporting BLOB values from a FILESTREAM column using SSIS.
• Lab 12: Building a product catalog in SSRS that displays images stored in a FILESTREAM column.

The focus of the labs is to demonstrate how to access FILESTREAM data from SSIS and SSRS. In SSIS, we will use a Script task rather than a Data Flow task to transfer FILESTREAM data. That way we can use the streaming APIs and increase performance. In SSRS, there are no special accommodations that can be made to use streaming access to the FILESTREAM data.

These labs do not even attempt to cover the core SSIS and SSRS features. For example, in the case of SSIS, features like configurations and parameters are not used. In the case of SSRS, no time is spent on designing a proper layout for the report. The goal of these labs is simply to provide you with basic examples that you can enhance further to fit your specific application needs.

FILESTREAM and SSIS

SSIS is a popular choice for moving data in and out of SQL Server. While it is quite easy to create SSIS packages that import and export SQL Server data, it is a little tricky to handle scenarios in which FILESTREAM values are involved. This is because, if we choose the current "standard" connection type for SSIS applications (the OLEDB connection), then we are forced to access the FILESTREAM data using T-SQL which, as discussed, is not ideal for performance reasons. Instead, in our labs, we'll use the ADO.NET connection in our SSIS packages, allowing us to import and export FILESTREAM data using the streaming APIs.

Lab 10: Loading BLOB Values into a FILESTREAM Column Using SSIS

In this example, "North Pole Corporation" has a product catalog application that is used to print occasional product catalogs. The catalog application has a very simple database with only one table (Items), which stores product information. This table is synchronized with the Products table in their main inventory management application using a nightly batch process.

The nightly batch process updates the Items table with the most recent information about the products that are currently available in the warehouse. It also copies the images of items into a file system folder and stores the path to the image files in the Items table. The catalog application uses the information in the Items table and the images stored in the file system to build the catalogs.

Listing 6-1 shows what the data in the Items table looks like. (To keep the example simple, I have omitted other columns such as SKU, Case Code, Price, and so on.)

    ItemID ItemCode   ItemName        ImageFileName
    ------ ---------- --------------- ----------------------
    1      1001       Microsoft Mouse C:\Images\IMG_5383.jpg
    2      3001       Dell Laptop     C:\Images\IMG_5384.jpg
    3      4005       Apple iPad      C:\Images\IMG_5385.jpg
    4      6009       Amazon Kindle   C:\Images\IMG_5386.jpg

Listing 6-1: The data in the Items table.

North Pole Corporation has recently decided to upgrade their catalog application to SQL Server 2008 R2. One of the reasons for the migration is to take advantage of the FILESTREAM storage feature introduced in SQL Server 2008. They want to store the images of products in a FILESTREAM column for better manageability.

The development team does not want to disturb the existing nightly job that synchronizes product information. They decide to add one more job that runs after the nightly synchronization job. The new job is responsible for loading the images of products into a FILESTREAM column in the Items table.

To achieve this, they identified the steps below.
• Make appropriate database modifications to support storing FILESTREAM data:
  • enable the FILESTREAM feature on the SQL Server instance
  • add a FILESTREAM filegroup to the product catalog database
  • add a FILESTREAM column to the Items table
  • create the necessary stored procedures.
• Create an SSIS package that loads the files into the FILESTREAM column.

Setting up the sample database

Listing 6-2 rebuilds the sample database for use with this lab.

    USE master
    GO
    -- ------------------------------------------------------
    -- If the sample database exists, drop it and re-create it
    -- ------------------------------------------------------
    IF DB_ID('NorthPole') IS NOT NULL
        DROP DATABASE NorthPole
    GO
    CREATE DATABASE NorthPole ON PRIMARY (
        NAME = NorthPole,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter6\NorthPole.mdf'
    ) LOG ON (
        NAME = NorthPole_log,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter6\NorthPole_log.ldf'
    )
    GO

Listing 6-2: Rebuilding the sample database.

Listing 6-3 creates the Items table and populates it with the sample data shown in Listing 6-1. Note that you need to change the file names used in the script to point to image files that exist on your computer. You can download the images used in the example from http://brurl.com/fs4. Download the file and extract it into the c:\images folder.

    USE NorthPole
    GO
    CREATE TABLE Items
    (
        ItemID INT IDENTITY ,
        ItemCode VARCHAR(10) ,
        ItemName VARCHAR(50) ,
        ImageFileName VARCHAR(100)
    )
    GO
    INSERT INTO Items
        ( ItemCode ,
          ItemName ,
          ImageFileName
        )
    SELECT '1001' , 'Microsoft Mouse' , 'C:\Images\IMG_5383.jpg'
    UNION ALL
    SELECT '3001' , 'Dell Laptop' , 'C:\Images\IMG_5384.jpg'
    UNION ALL
    SELECT '4005' , 'Apple iPad' , 'C:\Images\IMG_5385.jpg'
    UNION ALL
    SELECT '6009' , 'Amazon Kindle' , 'C:\Images\IMG_5386.jpg'

    SELECT * FROM Items
    /*
    ItemID ItemCode   ItemName        ImageFileName
    ------ ---------- --------------- ----------------------
    1      1001       Microsoft Mouse C:\Images\IMG_5383.jpg
    2      3001       Dell Laptop     C:\Images\IMG_5384.jpg
    3      4005       Apple iPad      C:\Images\IMG_5385.jpg
    4      6009       Amazon Kindle   C:\Images\IMG_5386.jpg
    */

Listing 6-3: Populating the sample table.

We'll assume at this stage that the team has enabled the FILESTREAM feature on the SQL Server instance where the catalog application database is hosted. For a detailed explanation of how to do this, see Appendix A: Configuring FILESTREAM on a SQL Server Instance. Once FILESTREAM has been enabled on the instance, we can add a new FILESTREAM filegroup to the current database (Listing 6-4).

    ALTER DATABASE NorthPole
    ADD FILEGROUP NORTHPOLE_FS
    CONTAINS FILESTREAM ;
    GO
    ALTER DATABASE NorthPole
    ADD FILE (
        NAME = NorthPoleFS,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter6\NorthPole_fs'
    )
    TO FILEGROUP NorthPole_fs ;
    GO

Listing 6-4: Adding a FILESTREAM filegroup to the database.

As we discussed previously, there are a number of prerequisites for adding a FILESTREAM column to an existing table; for full details on what is involved, refer back to Chapter 2. Otherwise, go ahead and use the script in Listing 6-5 to add the column.

    USE Northpole
    GO
    -- Add a UNIQUEIDENTIFIER column
    ALTER TABLE Items
    ADD ItemGuid UNIQUEIDENTIFIER NOT NULL
        DEFAULT (NEWID())

    -- Add a UNIQUE constraint
    ALTER TABLE Items
    ADD CONSTRAINT IX_ItemGuid UNIQUE(ItemGuid)

    -- Create ROWGUIDCOL
    ALTER TABLE Items
    ALTER COLUMN ItemGUID ADD ROWGUIDCOL

    -- Add the FILESTREAM column
    ALTER TABLE [dbo].[Items]
    ADD ItemImage VARBINARY(MAX) FILESTREAM NULL

Listing 6-5: Adding a FILESTREAM column to a table.

When you have added the FILESTREAM column, run a SELECT query on the Items table; you will see results similar to Listing 6-6. (Of course, the ItemGuid and ImageFileName columns might show different values.)

    It.. ItemName ImageFileName      ItemGuid          ItemImage
    ---- -------- ------------------ ----------------- ---------
    1001 Micros.. C:\..\IMG_5383.jpg 53C15E09-79CC-4.. NULL
    3001 Dell L.. C:\..\IMG_5384.jpg AFA5D604-68EA-4.. NULL
    4005 Apple .. C:\..\IMG_5385.jpg B6A6EE8D-7ED6-4.. NULL
    6009 Amazon.. C:\..\IMG_5386.jpg 6EFF334C-B2C1-4.. NULL

Listing 6-6: The Items table with the FILESTREAM column.

What we are planning to do in this lab is to run a loop over all the rows in the Items table to read the content of each image file specified in the ImageFileName column, and then store it in the FILESTREAM column, ItemImage. Listing 6-7 creates the stored procedure that the SSIS package will use to retrieve item information.

    USE NorthPole
    GO
    CREATE PROCEDURE [dbo].[GetItems]
    AS
    SELECT ItemID
    FROM Items ;

Listing 6-7: Creating a stored procedure to retrieve the ItemID column of all the rows.

The above stored procedure returns the ItemID of all the rows to be processed. The SSIS package will loop through each row returned by this procedure and execute the FILESTREAM-related code to load the images into the FILESTREAM column of the table.

All FILESTREAM operations should take place within the context of a SQL Server transaction. You might want to do it in one of the two ways illustrated below.

1. Start a SQL Server transaction at the beginning, process all the rows and commit.
2. Start a SQL Server transaction before processing each row and commit when the given row is processed.

We will take Option 2 for this lab because, with this approach, locks are placed on each row only for a minimal period of time. A row will be locked only when the processing starts and the lock will be released as soon as the given row is processed.

To load each image file into the FILESTREAM column, we will need the pieces of information below:

• the FILESTREAM transaction context
• the FILESTREAM logical path (the PathName() value)
• the name and location of the image file on the disk (to be loaded into the FILESTREAM column).

We'll create another stored procedure that accepts the ItemID and returns the above information. Listing 6-8 shows how to create the stored procedure.

    USE [NorthPole]
    GO
    CREATE PROCEDURE [dbo].[GetItemInfo] ( @ItemID INT )
    AS
    -- -----------------------------------------------------------------
    -- If the FILESTREAM column is NULL, we cannot retrieve PathName().
    -- In such a case, we need to use a dummy value
    -- -----------------------------------------------------------------
    UPDATE dbo.Items
    SET ItemImage = 0x
    WHERE ItemID = @ItemID
      AND ItemImage IS NULL
    -- -----------------------------------------------------------------
    -- SELECT the required information
    -- -----------------------------------------------------------------
    SELECT TOP 1
        GET_FILESTREAM_TRANSACTION_CONTEXT() AS FSContext ,
        ItemImage.PathName() AS FSPath ,
        ImageFileName
    FROM Items
    WHERE ItemID = @ItemID

Listing 6-8: Creating a stored procedure to retrieve information about the item.

Creating the SSIS package

We're now ready to create the SSIS package that loads the files into the FILESTREAM column. We'll proceed as shown below.

1. Create an SSIS project.
2. Add an Execute SQL task to retrieve all the rows from the Items table.
3. Add a Foreach Loop container, which will help us to run a loop over all the rows returned by the previous task.
4. Add a Script task into the Foreach Loop container, in which we will write the .NET code to load the content of the disk file into the FILESTREAM column of the table using FILESTREAM Managed APIs.

Creating a new SSIS project

Run SQL Server Business Intelligence Development Studio, and create a new SSIS project. The New Project dialog box is shown in Figure 6-1. Enter a Name, Location, and Solution Name, and click OK.

Figure 6-1: Creating a new SSIS project.

When you create a new SSIS project, an empty SSIS package is automatically added to the project. We'll rename the package FILESTREAM Stream In.dtsx; right-click on the package name in the Solution Explorer and select Rename (Figure 6-2).

Figure 6-2: Renaming the SSIS package.

When renaming a package file, a message box will verify whether you would like to rename the package object as well. Select Yes, and continue.

Figure 6-3: Package object rename prompt.

Creating an ADO.NET connection to the NorthPole database

Next, let's create a connection to the catalog application database. SSIS Connection Manager allows you to create different types of connections. Streaming access to the FILESTREAM data should be done within the context of a SQL Server transaction. Therefore, we will choose an ADO.NET connection instead of the usually preferred OLE DB connection type.

Right-click on the Connection Managers window at the bottom of the SSIS Designer and select New ADO.NET Connection. This opens the Configure ADO.NET Connection Manager dialog box, where we can create and configure ADO.NET connections (Figure 6-4).

Figure 6-4: The Configure ADO.NET Connection Manager.

Click the New button to create a new connection. This opens the Connection Manager dialog box (Figure 6-5). Make sure that the Provider selected is .Net Providers\SqlClient Data Provider. Then specify the server and database names. Ensure you connect using Windows authentication, because Win32 and Managed API access to the FILESTREAM data is possible only with Windows authentication.

Figure 6-5: The Connection Manager dialog box.

You can click Test Connection to check that the connection information you have entered is correct. Click OK to save the information, and then click OK again on the parent dialog box to save the configuration changes and return to Package Designer. You will see a new connection object in the Connection Managers window (Figure 6-6).

Figure 6-6: The new connection in the Connection Managers window.

For the sake of simplicity, we'll rename the connection NorthPole, so right-click on the Connection object and select Rename, or simply select the object and press F2 (Figure 6-7).

Figure 6-7: Renaming the connection object.

Adding an Execute SQL task

The next step is to add an Execute SQL task. You can find it in the toolbox under Control Flow Items; drag and drop it onto the Package Designer window (Figure 6-8).

Figure 6-8: Adding the Execute SQL task.

Now we'll configure the Execute SQL task to retrieve the item information. First, we'll use the stored procedure to retrieve the ItemID from the Items table. We only need the ItemID column because the information retrieved at this stage is used only for iterating over the rows; the Script task that we will create later will fetch additional information based on the ItemID being processed at each iteration through the loop.

Now, in the Package Designer, double-click on the Execute SQL task and configure the following properties on the General tab of the Execute SQL Task Editor (Figure 6-9):

• Name: Get Item List
• Description: Get list of items to be processed
• Result Set: Full result set
• Connection Type: ADO.NET
• Connection: NorthPole
• SQLSourceType: Direct input
• SQL Statement: GetItems
• IsQueryStoredProcedure: True.

Figure 6-9: Configuring the properties of the Execute SQL task.

We need to store the result set returned by the stored procedure in a local variable, and iterate over it. On the Result Set page, click the Add button to add a new result set, and then change the Result Name to 0 to map it to the first result set returned by the stored procedure (Figure 6-10).

Figure 6-10: Configuring the Result Set.

From the Variable Name drop-down list, select <New variable...> to create a new variable to hold the result set returned by the stored procedure. This opens the Add Variable dialog box (Figure 6-11).

Change the Name to items and Value type to Object. The Foreach Loop that we will add to the package in the next step will iterate over this variable to process all the rows returned by the stored procedure.

Figure 6-11: Adding the items variable to the Execute SQL task.

Click OK to create the new variable. Make sure items is selected under Variable Name in the Result Set page (Figure 6-12), and click OK to save the configuration.

Figure 6-12: Mapping the new variable to the Result Set.

Adding a Foreach Loop container

The next step is to add a Foreach Loop container to the package. You can find it in the toolbox under Control Flow Items; drag and drop it onto the Package Designer window, then connect it with the Execute SQL task (Figure 6-13).

Figure 6-13: Connecting the Get Item List and Foreach Loop Container objects.

Double-click the header of the Foreach Loop container to open its configuration dialog box (Figure 6-14). Then, in the Collection tab, set the following properties:

• Enumerator: Foreach ADO Enumerator
• ADO object source variable: User::items
• Enumeration Mode: Rows in the first table.

Figure 6-14: Configuring the properties of the Foreach Loop.

This configuration asks the enumerator to loop through all the rows in the first table contained in the object variable User::items. We have just configured the Execute SQL task to assign the result set returned by the stored procedure to this variable, so the Foreach Loop will iterate over the rows returned by the stored procedure.

For each iteration of the loop, we will execute the .NET code that loads the image file into the FILESTREAM column of the Items table. The code that we execute within the loop needs to know the ItemID of the row currently being processed. This can be achieved by creating one more user variable, into which the Foreach Loop component can store the ItemID value of the current row. You can do this from the Variable Mappings tab of the Foreach Loop Editor (Figure 6-15).

Figure 6-15: Configuring the Variable Mappings for the Foreach Loop.

From the Variable drop-down list, select <New variable...> to create a new variable and map it to the result set. This opens the Add Variable dialog box again; change the Name to itemID, the Value type to Int32, and the Value to 0 (Figure 6-16).

Figure 6-16: Adding the itemID variable to the Foreach Loop.

Click OK to save the changes. Make sure that the newly created variable is selected in the Variable column of the Variable Mappings page (Figure 6-17), and then click OK to save the configuration.

Figure 6-17: Mapping the new itemID variable.

Set the Index column to 0 to tell the Foreach Loop container to store the value from the first column of the result set in the itemID variable on each iteration. Click OK to save the configuration, and return to the Package Designer.

Adding a Script task to perform FILESTREAM operations

The next step is to add a Script task to the package, which we will use to write the .NET code to perform FILESTREAM operations using the FILESTREAM Managed API. You can find the Script task in the toolbox under Control Flow Items. Drag and drop it onto the Package Designer window, inside the Foreach Loop container (Figure 6-18).

Figure 6-18: Adding the Script Task inside the Foreach Loop container.

Next, we will write some .NET code to perform the FILESTREAM operations. Depending upon your preference, you can choose C# or VB, but note that this decision must be made before you make any changes to the script task; once the script task has been modified, you cannot change the .NET language associated with it.

Double-click on the Script Task to configure the properties of the script and start writing the FILESTREAM handling code (Figure 6-19).

Figure 6-19: Configuring the Script Task.

This property page allows you to select the script language you wish to use. For the purpose of this example, let us go ahead with C#.

Once configured, the Script task will be executed at each iteration of the Foreach Loop. For each iteration, the Foreach Loop stores the value of the ItemID column in the itemID variable. We need to pass the itemID variable to the Script task so that we can write the code to process the item specified by the variable.

Click on the button to the right of ReadOnlyVariables to open the Select Variables dialog box. Then, locate the itemID variable and select it (Figure 6-20).

Figure 6-20: Selecting the itemID variable.

Click OK to save the changes. On the parent dialog box, click the Edit Script button. This will open the C# code editor in a new Visual Studio window.

Writing the FILESTREAM processing code

The next step is to write the code needed to perform the FILESTREAM operations. This is the most important part of this lab. We will modify the template code given by the Script task to suit our requirements.

To start with, we need to add a few additional namespace declarations to the existing namespace declarations, as shown in Listing 6-9.

    using System.Data.SqlClient;
    using System.Data.SqlTypes;
    using System.IO;

Listing 6-9: Additional namespace declarations.

Next, we need to create a connection to the target database. We already created a connection object earlier in this lab. Listing 6-10 demonstrates how to obtain a SqlConnection object from the Connection Manager we created using the Package Designer.

    SqlConnection con;
    con = Dts.Connections["NorthPole"].AcquireConnection(null) as SqlConnection;

Listing 6-10: Retrieving a SqlConnection object from the Connection Manager.

Add the above code (and all the sample code provided in Listings 6-11 to 6-16) into the auto-generated Main() method of the Script task, at the location where it says:

    // TODO: Add your code here

Next, we need to determine the ItemID to be processed in the current iteration of the loop. We already created a package variable to which the ItemID will be assigned at each iteration of the loop. What we are doing here is assigning the value of the package variable itemID to a local variable ItemID in the script for further use.

Listing 6-11 shows how to access the itemID package variable from the Script task.

    Int32 ItemID = (int) Dts.Variables["User::itemID"].Value;

Listing 6-11: Reading the value of a package variable from the Script task.

So far, we have obtained a Connection object pointing to the target database, and identified the ID of the item to be processed.

The approach presented in this lab does a round trip to the server at each iteration of the loop. This is to keep the duration of FILESTREAM transactions minimal. Remember that all access to the FILESTREAM data should be made within the scope of a transaction.

An alternative approach would be to start a transaction at the beginning of the process and fetch the PathName() of all the rows with a single database call. In this case, the additional round trips to the server can be avoided. However, this approach will keep a top-level transaction open until the process completes, which could block the requests from other connections.

We now need a few local variables to store the FILESTREAM transaction context, the logical path name of the FILESTREAM data value, and the path to the disk file that we want to load into the FILESTREAM column, as shown in Listing 6-12.

    byte[] FSContext = null;
    string FSPathName = string.Empty;
    string FileName = string.Empty;

Listing 6-12: Setting up the local variables.

Next, we'll start a database transaction and execute the stored procedure (Listing 6-13).

    SqlCommand cmd = new SqlCommand();
    cmd.Connection = con;
    cmd.CommandText = "GetItemInfo";
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.AddWithValue("@ItemID", ItemID);
    SqlTransaction transaction = con.BeginTransaction("ItemTran");
    cmd.Transaction = transaction;
    SqlDataReader reader = cmd.ExecuteReader();

Listing 6-13: Starting a database transaction and executing the stored procedure.

The stored procedure will return the three critical pieces of information that we need. The code in Listing 6-14 reads these values and assigns them to the local variables we created earlier.

    reader.Read();
    FSContext = (reader["FSContext"] as byte[]);
    FSPathName = reader["FSPath"].ToString();
    FileName = reader["ImageFileName"].ToString();
    reader.Close();

Listing 6-14: Reading the values returned by the stored procedure.

Listing 6-15 opens the FILESTREAM data file for writing, and the image file for reading. It then reads from the disk file and writes into the FILESTREAM column.

    // Open the FILESTREAM data file for writing
    SqlFileStream fs = new SqlFileStream(FSPathName, FSContext,
        FileAccess.Write);
    // Open the disk file for reading
    FileStream localFile = new FileStream(FileName,
        FileMode.Open, FileAccess.Read);
    // Start transferring data from disk file to FILESTREAM data file
    BinaryWriter bw = new BinaryWriter(fs);
    const int bufferSize = 4096;
    byte[] buffer = new byte[bufferSize];
    int bytes = localFile.Read(buffer, 0, bufferSize);
    while (bytes > 0)
    {
        bw.Write(buffer, 0, bytes);
        bw.Flush();
        bytes = localFile.Read(buffer, 0, bufferSize);
    }

Listing 6-15: Reading from the disk file and writing to the FILESTREAM column.

It's now time to close the objects to release the resources and commit the transaction. Before we exit, we also set the TaskResult property of the Dts object to Success, to indicate that the process was successful (Listing 6-16).

    // Close the files
    bw.Close();
    localFile.Close();
    fs.Close();
    // Commit the transaction
    cmd.Transaction.Commit();
    Dts.TaskResult = (int)ScriptResults.Success;

Listing 6-16: Closing the files and committing the transaction.

To keep the code listing simple, we have not included any error-handling code. It is recommended that you wrap your code within a try-catch block to handle any possible errors. The above snippet assumes that the code executes successfully and therefore commits the transaction and sets the task result to Success. Note that if an error occurs after the transaction is opened, you should explicitly roll back the transaction from your catch/finally block. You may also want to set the task status to Failure as demonstrated in Listing 6-17 to indicate that the script did not succeed.

    cmd.Transaction.Rollback();
    Dts.TaskResult = (int)ScriptResults.Failure;

Listing 6-17: Rolling back the transaction in the event of an error.

The complete listing of the C# source code is available in the download. The code listing for VB is also supplied; if you want to write your code using VB, remember you must change the ScriptLanguage in the property page of the Script task before the Script task is modified.
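The error-handling advice above can be sketched as follows. This is a minimal sketch, not the code from the download: it folds the steps of Listings 6-13 to 6-17 into a single method, and the class name FileStreamLoader and the method name LoadItemImage are hypothetical. It assumes the GetItemInfo procedure from Listing 6-8 and an already open SqlConnection, and it writes to the SqlFileStream directly rather than through a BinaryWriter.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.IO;

public static class FileStreamLoader
{
    // Hypothetical helper: returns true after a commit, false after a rollback.
    public static bool LoadItemImage(SqlConnection con, int itemID)
    {
        SqlTransaction transaction = con.BeginTransaction("ItemTran");
        try
        {
            byte[] fsContext;
            string fsPathName;
            string fileName;

            // Fetch the transaction context, logical path and disk file name.
            using (SqlCommand cmd = new SqlCommand("GetItemInfo", con, transaction))
            {
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.Parameters.AddWithValue("@ItemID", itemID);
                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    if (!reader.Read())
                        throw new InvalidOperationException("No row for ItemID " + itemID);
                    fsContext = (byte[])reader["FSContext"];
                    fsPathName = reader["FSPath"].ToString();
                    fileName = reader["ImageFileName"].ToString();
                }
            }

            // The using blocks close both streams before the commit,
            // even if the copy loop throws.
            using (SqlFileStream fs = new SqlFileStream(fsPathName, fsContext, FileAccess.Write))
            using (FileStream localFile = new FileStream(fileName, FileMode.Open, FileAccess.Read))
            {
                byte[] buffer = new byte[4096];
                int bytes;
                while ((bytes = localFile.Read(buffer, 0, buffer.Length)) > 0)
                {
                    fs.Write(buffer, 0, bytes);
                }
            }

            transaction.Commit();
            return true;
        }
        catch (Exception)
        {
            // A real package would log the exception before rolling back.
            transaction.Rollback();
            return false;
        }
    }
}
```

Inside the Script task, the result could then drive the task status, for example: Dts.TaskResult = FileStreamLoader.LoadItemImage(con, ItemID) ? (int)ScriptResults.Success : (int)ScriptResults.Failure;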

Congratulations! We have successfully completed the first SSIS lab. Execute the package and verify that it executes without error and the FILESTREAM columns are updated correctly.

The simplest way to verify that the image files are successfully loaded into the FILESTREAM column is to run the query in Listing 6-18 and look at the size of the BLOB value stored in each row.

    SELECT
        ItemCode,
        ImageFileName,
        DATALENGTH(ItemImage) AS FileSize
    FROM Items
    /*
    ItemCode ImageFileName          FileSize
    -------- ---------------------- --------
    1001     C:\Images\IMG_5383.jpg 5303
    3001     C:\Images\IMG_5384.jpg 5344
    4005     C:\Images\IMG_5385.jpg 7489
    6009     C:\Images\IMG_5386.jpg 6470
    */

Listing 6-18: Verifying that image files are successfully loaded into the FILESTREAM column.

You can match the FileSize value returned by the query with the size of the disk file to verify that the image files have been successfully loaded into the FILESTREAM column. We will also cross-verify this in the next lab, where we will perform the reverse operation. We will export the BLOB values from the FILESTREAM column into a series of disk files.

Debugging the code

If you want to debug your code, you can set break points within the code and the execution will stop there. If your script task has a break point, it will show a red circle in the designer window as shown in Figure 6-21.

Figure 6-21: A break point in the Script Task.

If you are on a 64-bit computer, you may not be able to debug the package. In that case, open the Debugging tab in the properties of the project and change the Run64BitRuntime property to False (Figure 6-22).

Figure 6-22: Changing the Debug Options in the project's properties.

Lab 11: Exporting BLOB Values from a FILESTREAM Column Using SSIS

In the previous lab, we saw how to load the content of disk files into the FILESTREAM column of a table using an SSIS package. In this lab, we will do the reverse of this; we will copy the BLOB values from the FILESTREAM column to disk files.

Creating the SSIS package

Most of the code for this lab is the same as for the previous lab, so an easy way to start this lab is to make a copy of the SSIS package we created in Lab 10, and modify it.

Make a copy of the FILESTREAM Stream In.dtsx package, and then rename it to FILESTREAM Stream Out.dtsx.

Setting the output folder

We'll assume that the content of the FILESTREAM column needs to be copied to a specific location on the local disk. We'll keep the name of the target location in a variable, instead of hard-coding it in our script. Select the Variables option from the SSIS menu to create a new variable. In the Variables dialog box, click the New Variable button to add a new variable to the list of existing variables (Figure 6-23).

Figure 6-23: Creating a new variable.

When the variable has been added, edit the information as follows:

• Name: outputFolder
• Scope: FILESTREAM Stream Out
• Data Type: String
• Value: c:\temp

The Execute SQL task

The Execute SQL task does not need any changes. We will use the same stored procedure that we used in Lab 10, which returns a result set containing the ItemID of all the rows to be processed.

The Foreach Loop container

The Foreach Loop container does not need any changes either; it will iterate over each row returned by the Execute SQL task.

The Script task

The Script task needs a lot of reworking. Instead of copying BLOB data from the disk file into the FILESTREAM column, the version that we need to create now will read from the FILESTREAM column and write into disk files.

The new variable that we just added needs to be passed into the script. In the Script Task Editor, for ReadOnlyVariables, select outputFolder in addition to itemID, which the script still needs (Figure 6-24).

Figure 6-24: Adding a variable to the Script Task.

The script also needs a number of changes. To start with, we have to retrieve the value of the package variable outputFolder and store it in a .NET string variable (Listing 6-19).

    string OutputFolder = Dts.Variables["User::outputFolder"].Value.ToString();

Listing 6-19: Retrieving and storing the outputFolder variable.

In this lab, the FILESTREAM file must be opened for reading, and the disk file needs to be created and opened for writing.

The data stored in the table contains the full path and file name, for example: c:\filestream\images\abc.jpeg. We created a package variable earlier to store the location in which the image files need to be created, so we now need to build a new path based on the file path stored in the table and the target folder location stored in the package variable. Listing 6-20 shows how to do this. Note that it joins the two parts with Path.Combine rather than plain string concatenation, because the folder value we configured (c:\temp) has no trailing backslash; Path.Combine inserts the separator when it is missing.

    // Open the FILESTREAM data file for reading
    SqlFileStream fs = new SqlFileStream(FSPathName, FSContext,
        FileAccess.Read);
    // Open the disk file for writing
    string OnlyFilename = System.IO.Path.GetFileName(FileName);
    FileStream localFile = new FileStream(
        System.IO.Path.Combine(OutputFolder, OnlyFilename),
        FileMode.Create, FileAccess.Write);
    // Note that the BinaryWriter should be initialized with the
    // localFile object
    BinaryWriter bw = new BinaryWriter(localFile);

Listing 6-20: Building a path to the target location.

When you have completed the changes, execute the SSIS package and verify that the FILESTREAM data is correctly exported to the target location specified in the package.

The complete listings for the C# and VB source code to export BLOB values from a FILESTREAM column into disk files in the specified location are available in the download.
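Listing 6-20 opens both files but leaves the transfer loop to the download. As a minimal sketch of the remaining pieces, assuming the same 4 KB buffer used in Lab 10, the two helpers below build the target path and copy from any readable stream to any writable one. The names ExportHelper, BuildTargetPath and CopyStream are hypothetical and do not come from the book's download.

```csharp
using System;
using System.IO;

public static class ExportHelper
{
    // Hypothetical helper: joins the output folder with the file-name part
    // of the path stored in the table. Path.Combine adds the directory
    // separator when the folder does not already end with one.
    public static string BuildTargetPath(string outputFolder, string storedPath)
    {
        return Path.Combine(outputFolder, Path.GetFileName(storedPath));
    }

    // Hypothetical helper: copies everything from source to destination in
    // 4 KB chunks and returns the number of bytes copied.
    public static long CopyStream(Stream source, Stream destination)
    {
        byte[] buffer = new byte[4096];
        long total = 0;
        int bytes;
        while ((bytes = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, bytes);
            total += bytes;
        }
        destination.Flush();
        return total;
    }
}
```

In the Script task, the disk file would be created with new FileStream(ExportHelper.BuildTargetPath(OutputFolder, FileName), FileMode.Create, FileAccess.Write), and the export itself would be a call to ExportHelper.CopyStream(fs, localFile), where fs is the SqlFileStream opened with FileAccess.Read. Both streams still need to be closed before the transaction is committed, as in Listing 6-16. The returned byte count can be compared with the DATALENGTH value from Listing 6-18 as a quick check.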

Lab 12: Displaying FILESTREAM Data in SSRS Reports

Images stored in the FILESTREAM column of a SQL Server database can be displayed using SSRS reports. You can use an Image control to render the images in the report.

We have discussed, several times, that accessing FILESTREAM data through T-SQL is much less efficient than using the streaming APIs (Win32 or Managed APIs) provided by SQL Server. However, SSRS does not provide a way to access the FILESTREAM data using the streaming APIs. All access to the FILESTREAM data must go through T-SQL, and the FILESTREAM columns must be accessed as though they are VARBINARY(MAX) values. So, when creating a report that uses images stored in the FILESTREAM column of a SQL Server database, you do nothing that is FILESTREAM-specific.

In Labs 10 and 11, we looked at the example of a product catalog application used by North Pole Corporation and we saw how to import and export FILESTREAM data using SSIS. Now let's look at another requirement that is part of the product catalog application project: building the catalog.

In this lab, we'll create an SSRS report that will list the items in the catalog application database along with a thumbnail image of the items. The images are stored in the FILESTREAM column of the Items table. Note that this lab focuses only on the basic report design features required for completing the walk-through; detailed coverage of the SSRS features is beyond the scope of this book.

To complete this lab we will:

• create a new SSRS project
• create a data source to connect to the catalog application database
• create a dataset to retrieve product information
• design the report
• preview the report.

In a typical report development environment you would go further, and deploy the report, but this and all the other steps in a typical report development cycle remain the same as for a non-FILESTREAM column, and therefore they are not discussed in detail in this lab.

Creating an SSRS report project

To create a new SSRS project, start Business Intelligence Development Studio and select New Project from the File menu (Figure 6-25).

Figure 6-25: Creating a new SSRS project.

From the Business Intelligence Projects, select Report Server Project. Click OK to create an empty solution and project. Once the project has been created, the next step is to add a new report to the project. In the Solution Explorer, right-click on the Reports node and select Add | New Item. Then, from the Report Project templates, select Report (Figure 6-26).

Figure 6-26: Creating a new report.

Name the report ItemList.rdl, and then click Add to add the new, empty report to our project. We are now ready to get started.

Creating a data source

The next step is to create a data source pointing to the database of the product catalog application. Note that, in most real-world Reporting Services projects, you will create shared data sources and shared datasets so that the object definitions can be reused. However, for this example we will specify an embedded connection.

In the Report Data window, right-click the Data Sources node and select Add Data Source. When the Data Source Properties dialog box opens, either enter the connection string, or click Edit to build a connection string (Figure 6-27). Specify the SQL Server instance that hosts the product catalog database. When we created the data source for our SSIS lab, we were very careful to use Windows authentication in order to be able to access the data using the streaming APIs. Since SSRS does not support accessing the FILESTREAM data using Win32 or Managed APIs, we are free to choose between Windows authentication and SQL Server authentication.

Figure 6-27: Creating the data source.

Click OK on the Data Source Properties dialog box to save the newly created data source and return to the Report Designer.

Creating a dataset

This is the first step in which we are going to do something that touches the FILESTREAM data: we'll create a dataset that fetches the product information and images from the Items table.

To begin, right-click on the Datasets node in the Report Data window and select Add Dataset in order to open the Dataset Properties dialog box (Figure 6-28). We'll configure the dataset as an embedded dataset – fill out the properties as follows:

• Name: ProductCatalog
• Type: Use a dataset embedded in my report
• Data source: NorthPole
• Query type: Text
• Query: SELECT ItemCode, ItemName, ItemImage FROM Items

Figure 6-28: Adding a dataset to retrieve the product information.

Click OK to save the dataset properties and return to the Report Designer. In the Report Data explorer, you will see the new dataset along with the data columns (Figure 6-29).

Figure 6-29: The Report Data explorer showing the new dataset.

We will use this dataset to build our product catalog.

Designing the report

The next step is to design the report. What we need is a tabular representation of the item information. One way to achieve this is by using a Table control. In the Report Items toolbox, locate the Table control and then drag and drop it into the Report Designer. This adds an empty table to the report.

Now we'll bind the table columns to the columns returned by our dataset. Drag and drop the ItemCode and ItemName columns from the Report Data explorer into the first and second columns, respectively, of the Table control (Figure 6-30).

Figure 6-30: Binding the columns returned by the dataset to the Table control.

In the third column, we will display a thumbnail image of the product, so add an Image control to the third column: locate the Image control in the toolbox, then drag and drop it into the third column of the Table control. When you drop an Image control, it opens the Image Properties dialog box, shown in Figure 6-31 (if it does not open, right-click on the Image control and select Image Properties), which needs to be configured as follows:

• Name: ItemImage
• Select the image source: Database
• Use this field: ItemImage
• Use this MIME type: select the type of image – in this example, we used image/jpeg

Figure 6-31: Setting the image properties.

With this, we have completed the report design. You can improve the appearance by modifying the display attributes such as font, size, border, background, color, and so on, as required.

Figure 6-32: The completed report design.

Previewing the report

I'm sure you will agree that this is the easiest part of the process. If the report is designed correctly and the configuration is correct, it is just a matter of clicking the Preview tab (Figure 6-33).

Figure 6-33: Previewing the report.

The goal of this lab is to help you get started with building SSRS reports that use images stored in FILESTREAM columns. Remember that you are accessing the FILESTREAM data over T-SQL, and the performance benefits offered by I/O streaming are not available when using FILESTREAM data in SSRS reports.

Summary

The chapter presented three labs to help you to get started with creating SSRS and SSIS applications that deal with FILESTREAM data.

FILESTREAM columns can be accessed using the streaming APIs from SSIS using a Script task, which means that you can write .NET code that uses the Managed APIs to read and write FILESTREAM data. The first two labs demonstrated how to import and export FILESTREAM data using SSIS.

Though FILESTREAM columns can be accessed from SSRS, SSRS does not provide a way to use the Win32 or Managed APIs; all access to the FILESTREAM data is done through T-SQL. An SSRS report must access the FILESTREAM columns as though they are VARBINARY(MAX) columns, as we saw in Lab 12.

Chapter 7: FILESTREAM Database Administration

The previous chapters have discussed development with the FILESTREAM feature in quite some detail. Here, we begin a sequence of four chapters dedicated to the administration and management of FILESTREAM-enabled databases. In this chapter, we'll examine some special considerations for FILESTREAM-enabled databases, and how the presence of FILESTREAM data might affect common database administrative tasks. Specifically, we will cover the points below.

• Transaction isolation levels – The behavior of FILESTREAM-enabled databases under each of the available levels, and where and how this differs from standard databases.
• Moving FILESTREAM-enabled databases from one location to another – The main difference is the need to move the FILESTREAM data container as well as data and log files.
• FILESTREAM garbage files – Possible issues associated with creation of an excessive number of FILESTREAM disk files, due to frequent modification of FILESTREAM data, and how SQL Server manages garbage collection of these files.
• Common causes of a corrupt FILESTREAM database – And how to use DBCC checks to examine and possibly repair damaged databases.
• Coping with growth of FILESTREAM databases – How to move FILESTREAM data partially or fully to new disk volumes, by rebuilding a clustered index in a new filegroup, and by partitioning FILESTREAM tables.
• Migrating FILESTREAM data – Moving large databases using scripting, or the Database Publishing Wizard, in cases where restoring a backup is not feasible.

FILESTREAM and Database Transaction Isolation Levels

With SQL Server 2008, it was not possible to enable the snapshot isolation level for databases that contained FILESTREAM data. Hence, SQL Server 2008 FILESTREAM databases can operate only under one of the standard ANSI transaction isolation levels (READ COMMITTED, READ UNCOMMITTED, REPEATABLE READ, SERIALIZABLE). However, in SQL Server 2008 R2 and later, the snapshot isolation level has been extended to support FILESTREAM data.

If the row to be accessed is not locked by another process, FILESTREAM data can be accessed for reading or writing under all the transaction isolation levels.

If the row being accessed is locked (shared or exclusive mode) by another process, a write operation will fail under all the transaction isolation levels.

If the row being accessed is locked in Shared (S) mode by another process, FILESTREAM data can be accessed for reading under all the transaction isolation levels. However, if the row is locked in Exclusive (X) mode by another process, FILESTREAM data can be read only using the snapshot isolation level, and you will get the previous version of the FILESTREAM data value, if the other process has modified the FILESTREAM value.

READ COMMITTED, REPEATABLE READ and SERIALIZABLE isolation levels

The behavior of FILESTREAM data is the same for READ COMMITTED, REPEATABLE READ and SERIALIZABLE. If the row that contains the required FILESTREAM value is exclusively locked by another process, a call to PathName() to read this data, or write to it, will be blocked.

If the X lock is held for longer than the timeout duration, an error such as the following will be returned (when using ADO.NET):

Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.

If the row is not locked at the time of retrieving the PathName() value and the FILESTREAM transaction context, but another process obtains an exclusive lock on the row right after that and before the Win32 API call is made, the attempt to open the FILESTREAM data file for a read or write operation will fail with the following message:

The process cannot access the file specified because it has been opened in another transaction.

If the row that contains the FILESTREAM value is locked in Shared (S) mode, a call to PathName() will succeed. In such a case, FILESTREAM data will be available for reading. However, if an attempt is made to write to the FILESTREAM data file, the operation will fail with the same message.

READ UNCOMMITTED isolation level

Under the READ UNCOMMITTED isolation level, a call to PathName() returns NULL if the row is exclusively locked by another process, and neither read nor write operations can be performed on the FILESTREAM cell using Win32/Managed APIs or T-SQL.

If there is a shared lock on the row, PathName() returns the correct value, and you can access the FILESTREAM data for reading. If you attempt to open the FILESTREAM data file for writing, SQL Server will generate the same error as seen previously.
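The READ UNCOMMITTED behavior described above can be observed with two query windows. The following is a minimal sketch, assuming the NorthPole database and an Items table with ItemNumber, ItemDescription and ItemImage columns, as used in the examples in this book; the modified description text is purely illustrative. Run each part in its own session.

```sql
-- Session 1: take an exclusive (X) lock on one row and keep it open.
USE NorthPole;
GO
BEGIN TRANSACTION;
UPDATE  dbo.Items
SET     ItemDescription = 'Microsoft Mouse (locked)'  -- illustrative value
WHERE   ItemNumber = 'MS1001';
-- Leave this transaction open while session 2 runs,
-- then undo the change with: ROLLBACK TRANSACTION;

-- Session 2: request the path of the locked row under READ UNCOMMITTED.
USE NorthPole;
GO
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
BEGIN TRANSACTION;
SELECT  ItemNumber,
        ItemImage.PathName() AS FilePath,  -- per the text, NULL while the row is X-locked
        GET_FILESTREAM_TRANSACTION_CONTEXT() AS TxContext
FROM    dbo.Items
WHERE   ItemNumber = 'MS1001';
COMMIT TRANSACTION;
```

Repeating session 2 after rolling back session 1 should return the real path again, which makes the effect of the lock easy to see.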

Snapshot isolation level

This versioning-based isolation level was introduced in SQL Server 2005, but has only been available for FILESTREAM databases since SQL Server 2008 R2. When a database is operating in snapshot isolation, a client application can read FILESTREAM values stored on rows that are exclusively locked by another process, using T-SQL or the Win32 API; in other words, unlike with the standard transaction isolation levels, a call to PathName() is not blocked in this case. If the other transaction has modified the FILESTREAM value, the version of the FILESTREAM data file returned will be as it existed at the time the current transaction started.

Note that, for databases containing only relational data, snapshot isolation offers two modes of operation:

• SNAPSHOT mode – queries return committed data as of the beginning of the transaction
• READ_COMMITTED_SNAPSHOT mode – queries return committed data as of the beginning of the current statement.

However, for FILESTREAM-enabled databases with READ_COMMITTED_SNAPSHOT mode enabled, data will be returned as it existed at the start of the transaction. In other words, there is effectively only one mode of snapshot isolation available for FILESTREAM databases.

Note that snapshot isolation level is disabled by default; the query in Listing 7-1 will confirm whether or not it is enabled on the current database.

    SELECT  snapshot_isolation_state_desc
    FROM    sys.databases
    WHERE   name = DB_NAME()
    /*
    snapshot_isolation_state_desc
    ------------------------------
    OFF
    */

Listing 7-1: Determining whether snapshot isolation level is enabled.

If the snapshot isolation level is disabled (OFF), we can enable it by running the query in Listing 7-2. Note, again, that SQL Server 2008 does not allow you to enable snapshot isolation level on databases having FILESTREAM filegroups. This restriction has been removed in SQL Server 2008 R2.

    ALTER DATABASE NorthPole
    SET ALLOW_SNAPSHOT_ISOLATION ON

Listing 7-2: Enabling snapshot isolation level on a database.

Therefore, starting from SQL Server 2008 R2, we can use snapshot isolation to access a FILESTREAM data value for read operations even if the row being read is locked (shared or exclusive). However, write operations are not permitted on rows that are already locked (shared or exclusive) by another session. If any attempt is made to do so, SQL Server will display the following error:

The process cannot access the file specified because it has been opened in another transaction.
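Once snapshot isolation is allowed on the database, a session has to opt in before it benefits from it. The sketch below shows one way a reader session might do that; it assumes the NorthPole database and Items table used throughout this chapter, and that ALLOW_SNAPSHOT_ISOLATION has already been switched on.

```sql
USE NorthPole;
GO
-- Check both versioning-related settings for the current database.
SELECT  name,
        snapshot_isolation_state_desc,
        is_read_committed_snapshot_on
FROM    sys.databases
WHERE   name = DB_NAME();
GO

-- Opt in to snapshot isolation for this session, then read the
-- FILESTREAM value inside an explicit transaction.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT  ItemNumber,
        ItemImage.PathName() AS FilePath,
        GET_FILESTREAM_TRANSACTION_CONTEXT() AS TxContext,
        DATALENGTH(ItemImage) AS ImageBytes
FROM    dbo.Items
WHERE   ItemNumber = 'MS1001';
COMMIT TRANSACTION;
```

The explicit transaction matters here because GET_FILESTREAM_TRANSACTION_CONTEXT() returns NULL when no transaction is open.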

Summary of FILESTREAM behavior under transaction isolation levels

The following table summarizes the behavior of FILESTREAM operations under various transaction isolation levels, when the row being accessed is locked by another process.

    Row is locked by                                            T-SQL         Win32
    another process   Isolation level                           Read  Write   Read  Write
    ----------------  ----------------------------------------  ----  -----   ----  -----
    EXCLUSIVE         READ COMMITTED                            No    No      No    No
    EXCLUSIVE         READ UNCOMMITTED                          Yes   No      No    No
    EXCLUSIVE         REPEATABLE READ                           No    No      No    No
    EXCLUSIVE         SERIALIZABLE                              No    No      No    No
    EXCLUSIVE         SNAPSHOT (SQL Server 2008 R2 or later)    Yes   No      Yes   No
    SHARED            READ COMMITTED                            Yes   No      Yes   No
    SHARED            READ UNCOMMITTED                          Yes   No      Yes   No
    SHARED            REPEATABLE READ                           Yes   No      Yes   No
    SHARED            SERIALIZABLE                              Yes   No      Yes   No
    SHARED            SNAPSHOT (SQL Server 2008 R2 or later)    Yes   No      Yes   No

Table 7-1: Summary of behavior of FILESTREAM operations with isolation levels.

Detaching and Attaching FILESTREAM Databases

It is common practice to detach and attach database files in order to move a database from one location to another. In a typical detach-attach process, you detach the database, move the MDF and LDF files to the new location, and then perform an attach operation. The only difference for a FILESTREAM-enabled database is that, along with your data (MDF) and log (LDF) files, you also need to move the FILESTREAM data container.

SSMS does not fully support attaching FILESTREAM-enabled databases
The Attach Databases dialog box does not allow you to specify the location of the FILESTREAM data file. As a result, you cannot attach a FILESTREAM-enabled database using SSMS if the path to the FILESTREAM container is different from the original location.

Let's see this in action. Listing 7-3 re-creates our, by now familiar, NorthPole database and loads an image into a FILESTREAM column. As usual, change the locations and paths as required.

    USE master
    GO
    -- If the sample database exists, drop it and re-create it
    IF DB_ID('NorthPole') IS NOT NULL
        DROP DATABASE NorthPole
    GO
    CREATE DATABASE NorthPole ON PRIMARY (
        NAME = NorthPole,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole.mdf'
    ), FILEGROUP NorthPole_fs CONTAINS FILESTREAM (
        NAME = NorthPole_fs,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole_fs'
    ) LOG ON (
        NAME = NorthPole_log,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole_log.ldf'
    )
    GO
    -- Create a table with a FILESTREAM column
    USE NorthPole
    GO
    CREATE TABLE [dbo].[Items] (
        [ItemID] UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
        [ItemNumber] VARCHAR(20),
        [ItemDescription] VARCHAR(50),
        [ItemImage] VARBINARY(MAX) FILESTREAM NULL
    )
    GO
    -- Insert a row with FILESTREAM data into the table
    INSERT INTO Items (ItemID, ItemNumber, ItemDescription, ItemImage)
    SELECT  NEWID(),
            'MS1001',
            'Microsoft Mouse',
            bulkcolumn
    FROM    OPENROWSET(BULK 'C:\ArtOfFS\Demos\Images\MicrosoftMouse.jpg', SINGLE_BLOB) AS x

Listing 7-3: T-SQL script to create a sample database.

Let's assume that this database underpins a successful application and we need to move the data and log files to new, bigger hardware. To simulate this in our example, we'll move the primary data file and log file to C:\Temp\Data, and the FILESTREAM container to C:\Temp\FS. First, however, we need to detach the database, as shown in Listing 7-4 (alternatively, you can right-click the database in SSMS Object Explorer and select Tasks | Detach…).

    -- Detach the database
    USE master
    GO
    EXEC sp_detach_db @dbname = 'NorthPole'

Listing 7-4: Detaching a FILESTREAM-enabled database.

Next, locate the primary data file, log file and FILESTREAM data container, and move them to their new locations. Now we're ready to reattach the database to our SQL Server instance. This could be the same instance, a different instance on the same computer, or a different instance on a different computer. Note, though, that if the database is being attached to a different SQL Server instance, we must ensure that FILESTREAM is enabled on the target server (see Appendix A) and that the target SQL Server instance is running the same, or a later, version of SQL Server. You can detach a SQL Server 2008 database and attach it to a SQL Server 2012 instance, but you cannot detach a SQL Server 2012 database and attach it to a SQL Server 2008/R2 instance.

Once these files have been moved to the new location, we are ready to attach the database as shown in Listing 7-5.

    -- Attach the FILESTREAM database
    CREATE DATABASE NorthPole ON PRIMARY (
        NAME = NorthPole,
        FILENAME = 'C:\Temp\Data\NorthPole.mdf'
    ), FILEGROUP NorthPole_fs CONTAINS FILESTREAM (
        NAME = NorthPole_fs,
        FILENAME = 'C:\Temp\FS\NorthPole_fs'
    ) LOG ON (
        NAME = NorthPole_log,
        FILENAME = 'C:\Temp\Data\NorthPole_log.ldf'
    )
    FOR ATTACH
    GO

Listing 7-5: Attaching a FILESTREAM-enabled database.

To verify success, connect to the NorthPole database and fire off a simple query to retrieve the data from the Items table.
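A verification query along the following lines would do; this is only a sketch, assuming the Items table from Listing 7-3. It first confirms that the database files point at the new locations, and then reads the size of the FILESTREAM value, which forces SQL Server to reach into the relocated data container.

```sql
USE NorthPole;
GO
-- Confirm that the data, log and FILESTREAM files now point at the new paths.
SELECT  name,
        type_desc,
        physical_name
FROM    sys.database_files;
GO

-- Touch the FILESTREAM column so that the relocated container is actually used.
SELECT  ItemNumber,
        ItemDescription,
        DATALENGTH(ItemImage) AS ImageBytes
FROM    dbo.Items;
GO
```

If the container was not moved to the path given in the attach statement, the attach itself or the second query is where the problem would be expected to surface.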

FILESTREAM and Garbage Files

SQL Server does not support in-place updates of the disk file storing the BLOB data for a FILESTREAM column. A disk file is created for every non-NULL FILESTREAM data value stored in every FILESTREAM column of every table in the database. When you update the BLOB value stored in a FILESTREAM cell, the associated disk file also needs to be updated. What SQL Server does internally is to create a new file with the modified value. The old version of the file will lie around in the folder until the file is no longer needed.

Unwanted files are later deleted by an asynchronous garbage collection process that runs in the background. An old version of a FILESTREAM data file is available for garbage collection when a database CHECKPOINT occurs after the transaction log is truncated past the LSN (Log Sequence Number) which generated the new file.

For this reason, FILESTREAM storage is not ideal for cases where the BLOB data stored in the FILESTREAM column is frequently updated. Every time the FILESTREAM value is updated (from a non-NULL value to another non-NULL value) SQL Server will have to create a new file with the modified value. If the frequency of such update operations is high, you might experience severe performance problems. In such a case, storing the BLOB data in the file system and storing the path to the disk file in a table might be more efficient than using a FILESTREAM column.

Garbage files and recovery models

Under the SIMPLE recovery model, the FILESTREAM garbage collector is triggered when a database CHECKPOINT occurs.
You can issue a CHECKPOINT to trigger the garbage collection thread to clean up the unwanted files in the FILESTREAM data container.

Under the FULL and BULK_LOGGED recovery models, you might notice that issuing a CHECKPOINT does not clean up the unwanted files in the FILESTREAM data container. A CHECKPOINT may still run the garbage collector, but it may not remove the file as you would normally expect. This is because the transaction log virtual log file (VLF) that contains the record is still active, and will remain so until the log is backed up. Every FILESTREAM operation is mapped to a transaction LSN in the transaction log. Under FULL and BULK_LOGGED recovery models, the garbage collector will remove a file only after the transaction log VLF that contains the LSN associated with the FILESTREAM operation is no longer part of the active log.

FILESTREAM garbage collection and tombstone tables

As we have seen, SQL Server does not support in-place updates of FILESTREAM data. If we modify a BLOB value stored in a FILESTREAM column, SQL Server will create a new disk file with the modified value. The old file will stay around for some time until it is "no longer needed" and the garbage collector steps in and deletes it.

It is interesting to note how SQL Server keeps track of the garbage files. If we modify a few FILESTREAM data values and look into the FILESTREAM data folder, we'll find many more files than the number of rows in the table. So the question is, "Which are garbage files and which are not?" SQL Server creates an internal table called the FILESTREAM tombstone table to hold the details of garbage files to be removed. The garbage collector queries this table to identify the garbage files and removes them when those files are no longer needed.

To see this in action, run the script in Listing 7-6 to create a sample database containing a table with two FILESTREAM columns.

    USE master
    GO
    IF DB_ID('NorthPole') IS NOT NULL
        DROP DATABASE NorthPole
    GO
    CREATE DATABASE NorthPole ON PRIMARY (
        NAME = NorthPole,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole.mdf'
    ), FILEGROUP NorthPole_fs CONTAINS FILESTREAM (
        NAME = NorthPole_fs,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole_fs'
    ) LOG ON (
        NAME = NorthPole_log,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole_log.ldf'
    )
    GO
    USE NorthPole
    GO
    CREATE TABLE [dbo].[Items] (
        [ItemID] UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
        [ItemNumber] VARCHAR(20),
        [ItemDescription] VARCHAR(50),
        [ItemImage1] VARBINARY(MAX) FILESTREAM NULL,
        [ItemImage2] VARBINARY(MAX) FILESTREAM NULL
    )

Listing 7-6: Creating an example database with two FILESTREAM columns.

Now we'll insert a row into the FILESTREAM table (Listing 7-7).

    INSERT INTO Items (ItemID, ItemNumber, ItemDescription, ItemImage1, ItemImage2)
    SELECT  NEWID(),
            'MS1001',
            'Microsoft Mouse',
            0x1,
            0x1

Listing 7-7: Inserting a row into the table.
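Because the recovery model determines when the garbage collector can actually remove files, it can be useful to check it before generating any garbage. The following is a minimal sketch, assuming the NorthPole database from Listing 7-6. The log backup path is a hypothetical location, and the BACKUP LOG statement will only succeed once a full database backup has been taken.

```sql
USE NorthPole;
GO
-- Which recovery model is the current database using?
SELECT  name,
        recovery_model_desc
FROM    sys.databases
WHERE   name = DB_NAME();
GO

-- Under FULL or BULK_LOGGED, back up the log first so that the VLF holding
-- the FILESTREAM operation can leave the active log. Under SIMPLE, a
-- CHECKPOINT alone is enough to give the garbage collector a chance to run.
IF (SELECT recovery_model_desc
    FROM   sys.databases
    WHERE  name = DB_NAME()) <> 'SIMPLE'
BEGIN
    BACKUP LOG NorthPole
    TO DISK = 'C:\ArtOfFS\Demos\Chapter7\NorthPole_log.trn';  -- hypothetical path
END;

CHECKPOINT;
GO
```

In practice, more than one log backup and CHECKPOINT cycle may be needed before the unwanted files disappear from the data container.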

Run the code in Listing 7-8 to update the FILESTREAM columns. The first column is updated to NULL and SQL Server discards the FILESTREAM data file associated with the field. The second column is updated with a new value, and SQL Server creates a new version of the file containing the new value, and discards the previous file.

    UPDATE  Items
    SET     ItemImage1 = NULL,
            ItemImage2 = 0x2

Listing 7-8: Updating the columns to create garbage files.

After the UPDATE operation, you will see three files in the FILESTREAM data container. Out of the three, two are no longer needed. Those two files will be registered in the FILESTREAM tombstone table. We can try to read the names of those files from that table, but first we need to find out its name, as shown in Listing 7-9.

    SELECT  name
    FROM    sys.internal_tables
    WHERE   name LIKE 'FILESTREAM_tomb%'
    /*
    name
    --------------------------------
    FILESTREAM_tombstone_2073058421
    */

Listing 7-9: Retrieving the name of the FILESTREAM tombstone table.

However, the tombstone table cannot be accessed from user mode queries. To read information from this table, we need to connect to the server using a Dedicated Administrator Connection (DAC).

To make a DAC via SSMS:

• from the File menu, select New, Database Engine Query
• in the connection dialog box, enter ADMIN:your-server-name; for example, if your server name is Server1, enter ADMIN:Server1 (Figure 7-1)
• enter the credentials
• write your queries in the new query window.

Figure 7-1: Making a DAC using SSMS.

Please note that DAC is for diagnostic/debugging purposes and should not be misused. Also note that only one DAC is possible at any given time, which means that the Object Explorer window cannot display the tree view of the database to which you are connected. This is why you will get an error when trying to connect using DAC from the New Query button, or from the Connect to Database Engine option of the Object Explorer.
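Before querying internal tables, it may be worth confirming that the query window really is using the DAC. The following sketch relies on the sys.dm_exec_connections and sys.endpoints views; the column aliases are illustrative only.

```sql
-- Show which endpoint the current session is connected through.
-- On a DAC session, the endpoint name is expected to be
-- 'Dedicated Admin Connection'.
SELECT  c.session_id AS SessionID,
        e.name AS EndpointName
FROM    sys.dm_exec_connections AS c
        INNER JOIN sys.endpoints AS e
            ON e.endpoint_id = c.endpoint_id
WHERE   c.session_id = @@SPID;
```

An ordinary connection would typically report an endpoint such as the shared memory or TCP endpoint instead.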

Connect to your instance using DAC and run the query in Listing 7-10, which will retrieve the list of garbage files in the current database. If you have followed the example, the query will return details of two garbage files.

    SELECT  physical_name AS FILESTREAMFolder,
            rowset_guid AS TableFolder,
            column_guid AS ColumnFolder,
            FILESTREAM_value_name AS FILESTREAMFile
    FROM    sys.filestream_tombstone_2073058421 ft
            INNER JOIN sys.database_files df
                ON df.file_id = ft.file_id
    /*
    FILESTREAMFolder         TableFolder    ColumnFolder   FILESTREAMFile
    -----------------------  -------------  -------------  ---------------
    C:\Demos\FS\NorthPoleFS  448BFCA8-6...  91DFDC80-E...  00000018-000...
    C:\Demos\FS\NorthPoleFS  448BFCA8-6...  2A7202BA-4...  00000018-000...
    */

Listing 7-10: Retrieving the list of garbage files.

FILESTREAM Data Corruption and DBCC Checks

One of the most undesirable situations for a database administrator is database corruption. In most cases, regular database checks and maintenance activities will help a database administrator to ensure that the databases are healthy and free from corruption.

A common cause of a corrupted FILESTREAM database is an "over-enthusiastic" administrative user who wants to play with the FILESTREAM data files directly. Remember that all access to the FILESTREAM data should be made through SQL Server. Direct access to the FILESTREAM data using the file system tools may corrupt your database.

The following sections examine a few common causes of a corrupt FILESTREAM database, and how to use DBCC CHECKDB or DBCC CHECKTABLE to check the consistency of your database along with the FILESTREAM data, and try to fix any corruption issues.

Corruption caused by missing FILESTREAM data files

The recommended way to delete a FILESTREAM data file is by setting the value of the column to NULL by performing a T-SQL UPDATE operation. If a user or software process directly deletes a file from the FILESTREAM data container, you will end up with a corrupted database. Let's replicate this scenario here (we're about to deliberately corrupt a database, so only do this on a test database on a test server).

Run the script in Listing 7-11 to create a new FILESTREAM-enabled database with a FILESTREAM column and two rows.

    USE master
    GO
    IF DB_ID('NorthPole') IS NOT NULL
        DROP DATABASE NorthPole
    GO
    CREATE DATABASE NorthPole ON PRIMARY (
        NAME = NorthPole,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole.mdf'
    ), FILEGROUP NorthPole_fs CONTAINS FILESTREAM (
        NAME = NorthPole_fs,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole_fs'
    ) LOG ON (
        NAME = NorthPole_log,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole_log.ldf'
    )
    GO
    USE NorthPole
    GO
    CREATE TABLE [dbo].[Items] (
        [ItemID] UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
        [ItemNumber] VARCHAR(20),
        [ItemDescription] VARCHAR(50),
        [ItemImage] VARBINARY(MAX) FILESTREAM NULL
    )
    GO
    INSERT INTO Items (ItemID, ItemNumber, ItemDescription, ItemImage)
    VALUES  (NEWID(), 'MS1001', 'Microsoft Mouse', 0x1),
            (NEWID(), 'MS1002', 'Microsoft Keyboard', 0x2)

Listing 7-11: Creating a FILESTREAM database with a FILESTREAM column and two rows.

Navigate to the FILESTREAM data container using Windows Explorer and you will see the two FILESTREAM data files in the folder associated with the Items table. Go ahead and delete one of those files.

Let's fire off a few queries to confirm that the database has indeed been corrupted. Listing 7-12 reads data from the Items table.

    SELECT  itemNumber,
            ItemDescription
    FROM    Items
    /*
    itemnumber            ItemDescription
    --------------------  -------------------
    MS1001                Microsoft Mouse
    MS1002                Microsoft Keyboard
    */

Listing 7-12: Reading relational data from the corrupted Items table.

That worked… but we did not touch the FILESTREAM data in the query. Listing 7-13, on the other hand, will try to read from the FILESTREAM column.

    SELECT  itemnumber,
            itemImage
    FROM    Items
    /*
    itemnumber            ItemImage
    --------------------  ------------------------------
    Msg 233, Level 20, State 0, Line 0
    A transport-level error has occurred when receiving
    results from the server. (provider: Shared Memory
    Provider, error: 0 - No process is on the other
    end of the pipe.)
    */

Listing 7-13: Attempting to read FILESTREAM data from the corrupted Items table.

This is a fatal error, with severity level 20. In SQL Server 2008 and R2 the connection will be terminated when such an error occurs. However, in SQL Server 2012 this error does not terminate the connection.

Note that the error occurs only if you try to access a FILESTREAM value that has a problem (in this case, one whose file is missing from the FILESTREAM data container). Querying a valid FILESTREAM value will work correctly, as shown in Listing 7-14.

    SELECT  itemNumber,
            DATALENGTH(ItemImage)
    FROM    items
    WHERE   itemNumber = 'MS1002'
    /*
    itemnumber
    --------------------  --------------------
    MS1002                4
    */

Listing 7-14: Reading uncorrupted FILESTREAM data in a corrupted database.

Checking for database consistency

Consistency of the FILESTREAM data can be checked using DBCC CHECKDB for the entire database, or DBCC CHECKTABLE to validate a single table. In Listing 7-15, we run DBCC CHECKTABLE to scan the table that we just corrupted by deleting the FILESTREAM data file using Windows Explorer.

    DBCC CHECKTABLE('Items')
    /*
    DBCC results for 'sys.FILESTREAM_tombstone_2073058421'.
    There are 0 rows in 1 pages for object
    "sys.FILESTREAM_tombstone_2073058421".
    DBCC results for 'Items'.
    Msg 7904, Level 16, State 2, Line 1
    Table error: Cannot find the FILESTREAM file
    "00000015-0000005e-0004" for column ID 4 (column directory
    ID 4d8cb114-f573-490e-adfa-e4583889b836) in object ID 2105058535,
    index ID 0, partition ID 72057594038779904, page ID (1:157),
    slot ID 0.
    There are 2 rows in 1 pages for object "Items".
    CHECKTABLE found 0 allocation errors and 1 consistency errors
    in table 'Items' (object ID 2105058535).
    repair_allow_data_loss is the minimum repair level for the
    errors found by DBCC CHECKTABLE (NorthPole.dbo.Items).
    DBCC execution completed. If DBCC printed error messages, contact
    your system administrator.
    */

Listing 7-15: Scanning the corrupted table.

As expected, the DBCC CHECKTABLE command scanned the entire table and detected that a FILESTREAM data file is missing from the FILESTREAM data container.
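The same check can be run against the whole database rather than a single table. The sketch below assumes the NorthPole database; the NO_INFOMSGS and ALL_ERRORMSGS options are standard DBCC CHECKDB options that suppress the informational output and show every error found, which tends to make the result easier to read on a larger database.

```sql
USE master;
GO
-- Check every table in the database, including its FILESTREAM data.
-- NO_INFOMSGS hides the per-object informational messages;
-- ALL_ERRORMSGS reports all errors rather than a limited number per object.
DBCC CHECKDB ('NorthPole') WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO
```

On a healthy database this command returns no messages at all, so any output is worth investigating.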

Fixing corrupted data

One of the key features offered by FILESTREAM is consistency between FILESTREAM data and relational data. We just deleted a FILESTREAM data file directly (outside SQL Server) and so lost the consistency of data. There are only two ways to fix this: by deleting the row that contains the FILESTREAM data file (recommended), or by adding the lost file back to the FILESTREAM data container.

If you do not want to lose the relational data, you could fix the corruption by copying the correct file directly back into the FILESTREAM data container. This requires that you can locate a copy of the file somewhere, possibly by restoring from a backup. SQL Server allows you to query the relational data even though the FILESTREAM data is corrupted, which may help you to identify which data value is missing. Note that the file name of the new file must be the same; the file name and location can be identified from the DBCC error message.

If you cannot locate a copy of the missing or corrupt file, you can put a dummy file with the expected file name into the FILESTREAM data folder, and this will "fix" the corruption so that you do not have to delete the relational data (which may be hard to get back). For example, if your table stores the avatar of users and you lost the image of one record due to a mistake, you could put an anonymous image file into the FILESTREAM data folder.

To use a dummy file, locate an existing file or create a new file that is of the same type as the missing file. Copy the dummy file to the FILESTREAM data container and rename it to match the name of the missing file (as reported by DBCC CHECKDB). Having done so, rerun DBCC CHECKDB or DBCC CHECKTABLE, and you will see that the corruption is fixed.

If you can tolerate some degree of data loss, then the alternative solution is to delete the row from the table to make the data consistent; we can do this by running DBCC CHECKDB with repair_allow_data_loss, which is about as scary as it sounds; we're about to lose some data! In this case, it will delete the associated row(s) from the table and restore the consistency of the data.
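The DBCC error message identifies the damaged value by page and slot rather than by key, so it is not always obvious which row is affected. One possible way to map the message back to a row is sketched below. It relies on %%physloc%% and sys.fn_PhysLocFormatter, which are undocumented and unsupported features of SQL Server 2008 and later, so treat it as a diagnostic aid for a test system only. The query reads only relational columns, so it does not touch the missing file.

```sql
USE NorthPole;
GO
-- Show the physical location of each row as (file:page:slot).
-- A DBCC message reporting page ID (1:157), slot ID 0 would
-- correspond to the row whose location is shown as (1:157:0).
SELECT  sys.fn_PhysLocFormatter(%%physloc%%) AS RowLocation,
        ItemID,
        ItemNumber,
        ItemDescription
FROM    dbo.Items;
GO
```

Knowing the ItemNumber of the affected row makes it easier to find the right replacement file, or to judge what would be lost by a repair.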

To see this working, go back to the FILESTREAM data folder and delete the file we just created. Run DBCC CHECKTABLE again, and you will see that the error is reported again.

Run the script in Listing 7-16 to fix the corruption by deleting the records that have corrupted FILESTREAM values. The first action the script performs is to take the database into single user mode. The database needs to be in single user mode to perform a repair operation.

    ALTER DATABASE NorthPole SET SINGLE_USER
    WITH ROLLBACK IMMEDIATE
    GO
    DBCC CHECKDB('NorthPole', repair_allow_data_loss)
    /*
    Msg 7904, Level 16, State 2, Line 2
    Table error: Cannot find the FILESTREAM file "00000015-0000005e-0004"
    for column ID 4 (column directory ID 4d8cb114-f573-490e-adfa-e4583889b836) in
    object ID 2105058535, index ID 0, partition ID 72057594038779904, page ID (1:157),
    slot ID 0.
    The error has been repaired.
    Msg 8945, Level 16, State 1, Line 2
    Table error: Object ID 2105058535, index ID 2 will be rebuilt.
    The error has been repaired.
    */

Listing 7-16: Fixing FILESTREAM corruption by deleting the affected rows.

If you run a query on the Items table, you will see that it now has only one row; the row from which the FILESTREAM data was missing has been deleted as part of the repair process. Again, keep in mind that this approach may or may not be acceptable, depending upon the nature of your application and sensitivity of the data.
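The repair script leaves the database in single user mode, so a natural follow-up is to confirm the result and then open the database to other users again. A minimal sketch, assuming it is run from the same session that performed the repair:

```sql
USE NorthPole;
GO
-- Confirm which rows survived the repair.
SELECT  ItemNumber,
        ItemDescription,
        DATALENGTH(ItemImage) AS ImageBytes
FROM    dbo.Items;
GO

-- Return the database to normal multi-user operation.
USE master;
GO
ALTER DATABASE NorthPole SET MULTI_USER;
GO
```

Running DBCC CHECKDB once more before switching back to MULTI_USER is a reasonable extra precaution.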

Corruption caused by orphaned files

Creating or copying unwanted files in the FILESTREAM data container can also cause corruption in a database.

Let us see what happens when an unwanted file is present in the FILESTREAM data container. Go to the FILESTREAM data container and create a new file, or copy an existing file from another folder. For this demonstration, I copied MicrosoftMouse.jpg into the FILESTREAM data container's folder structure. Then, run DBCC CHECKDB or DBCC CHECKTABLE and you should see an error message similar to the following:

Table error: The file "\12f4b92b-6c51-43a5-9a32-71d1b0de0646\MicrosoftMouse.jpg" in the rowset directory ID 4443b7d5-904c-475d-bcc8-20bc0e09b887 is not a valid FILESTREAM file.

Note that SQL Server complains that there is a file present that is not a valid FILESTREAM file. One could probably argue that this is because the file is not correctly named. Files in the FILESTREAM data folder do not have extensions, and they are named following a certain pattern. Go back to your FILESTREAM folder and rename the file so that it follows the same naming pattern as the other files in the FILESTREAM folder, and then run DBCC CHECKTABLE again.

Table error: The orphaned file "0000001b-0000015c-0005" was found in the FILESTREAM directory ID 12f4b92b-6c51-43a5-9a32-71d1b0de0646 for object ID 21575115, index ID 0, partition ID 72057594039042048, column ID 4.

Note that the error is different now. This time, SQL Server detected that there is an orphaned file in the FILESTREAM directory. Let's see if DBCC CHECKDB can fix this by deleting the orphaned file; rerun DBCC CHECKDB with repair_allow_data_loss (Listing 7-16).

Table error: The orphaned file "0000001b-0000015c-0005" was found in the FILESTREAM directory ID 12f4b92b-6c51-43a5-9a32-71d1b0de0646 for object ID 21575115, index ID 0, partition ID 72057594039042048, column ID 4.

As you can see, SQL Server cannot fix an inconsistency error of this type. If it finds a foreign file in the FILESTREAM data container, it will return an error, but DBCC CHECKTABLE or DBCC CHECKDB will not remove the unwanted file. You must remove it yourself to fix the corruption.

Corruption can leak into your backup files
Note that, if you take a backup of the database, the backup will also include the orphaned file. If you subsequently restore the backup, the new database will be corrupted as well.

Before moving on, delete the orphaned file from the FILESTREAM data container, then run DBCC CHECKDB again, to make sure that the database is no longer in a corrupted state.

Corruption caused by deleting the garbage files manually

As discussed earlier in the Garbage files and recovery models section, there is sometimes a delay before garbage files are deleted. If you lose patience and delete garbage files yourself, you may create a problem for the garbage collector, as it will try to remove files that no longer exist.

If you have been following through with the example over the previous sections, the Items table in the NorthPole database has only one record. Go to the FILESTREAM data container and note down the name of the file. Next, run the script in Listing 7-17 to update the FILESTREAM data value. The update operation will generate a new file with the modified value and the old file will become garbage.

    UPDATE TOP (1) Items
    SET ItemImage = CAST(REPLICATE('A', 2048) AS VARBINARY(MAX))

Listing 7-17: Updating a FILESTREAM data value to create a garbage file.

After the update, you will see that there are two files in the FILESTREAM data container. A new file has been generated with the modified value, and the previous file (whose name you noted) is now garbage.

My tests show that deleting this garbage file manually, without waiting for the garbage collector, does not corrupt the database in present versions of SQL Server. After deleting the garbage files, I ran DBCC CHECKDB and no error was reported. This was surprising to me, because the garbage files are tracked internally through the FILESTREAM tombstone table and I would expect SQL Server to report an error if one goes missing.

As has been stressed several times already, it is not a good idea to touch the FILESTREAM data container directly. All access to the FILESTREAM data should be done through SQL Server – either using T-SQL or through the Win32 API calls. Do not ever attempt to clean FILESTREAM garbage by yourself, unless there is an emergency where you are running out of storage space and you badly need some of those garbage files to be removed (and, of course, you fully understand the risks of touching FILESTREAM data directly).

Querying FILESTREAM Databases

SQL Server has a number of FILESTREAM-related system views that you can use to query additional information about the FILESTREAM storage and activities. They are listed in Chapter 9: Investigating FILESTREAM Databases, along with coverage of a few new system wait types to deal with the various FILESTREAM-related operations and some of the common FILESTREAM metadata queries that you might find very helpful while working with FILESTREAM-enabled databases.
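As a small taste of such metadata queries, the sketch below lists the FILESTREAM data containers and the FILESTREAM columns in the current database, using the sys.database_files and sys.columns catalog views. It is an illustration only, not the set of queries presented in Chapter 9, and the column aliases are arbitrary.

```sql
USE NorthPole;
GO
-- List the FILESTREAM data containers of the current database.
SELECT  name AS LogicalName,
        physical_name AS ContainerPath,
        state_desc AS FileState
FROM    sys.database_files
WHERE   type_desc = 'FILESTREAM';
GO

-- List every column that carries the FILESTREAM attribute.
SELECT  OBJECT_SCHEMA_NAME(c.object_id) AS SchemaName,
        OBJECT_NAME(c.object_id) AS TableName,
        c.name AS ColumnName
FROM    sys.columns AS c
WHERE   c.is_filestream = 1;
GO
```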

FILESTREAM Data and Space Management

The FILESTREAM feature enables the storage of large BLOB values in the database, and one of the challenges that database administrators will face is space management. Moving a copy of the production database to a secondary location or to the development environment will be a real challenge in many cases, so it is very important to plan carefully when you decide to go with a FILESTREAM implementation.

FILESTREAM database and backups
As we will discuss in Chapter 8: Back Up and Restore for FILESTREAM Databases, the backup of a FILESTREAM database will also include the FILESTREAM data, so large backup sizes and long backup times are also a potential issue.

If the expected workload and use cases are known in advance, it will be easier to decide the number of FILESTREAM filegroups needed, as well as to identify any data partitioning requirements for handling storage and retrieval. However, there may be many cases where this information is not known in advance or the actual workload and usage scenarios may overshoot all the estimations made at design time.

These scenarios may lead to the requirement to move FILESTREAM data partially or fully to new disk volumes with the minimum possible downtime. We'll examine a few options that will allow us to do this.

Changing the FILESTREAM filegroup location

In this example, let's imagine that our database started small, and we expected the FILESTREAM data to grow to 100 GB in three years. Therefore, we put the FILESTREAM data file on a 148 GB disk volume. After a year, it's clear that the data is growing faster than expected and we have already used over 120 GB on the disk volume and need to move the FILESTREAM data to a larger disk.

Run the T-SQL code provided in Listing 7-18 to rebuild our sample database.

    USE master
    GO
    IF DB_ID('NorthPole') IS NOT NULL
        DROP DATABASE NorthPole
    GO
    CREATE DATABASE NorthPole ON
    PRIMARY (
        NAME = NorthPole,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole.mdf'
    ), FILEGROUP NorthPole_fs CONTAINS FILESTREAM (
        NAME = NorthPole_fs,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole_fs'
    )
    LOG ON (
        NAME = NorthPole_log,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole_log.ldf'
    )
    GO

Listing 7-18: Creating the sample database.

This creates a FILESTREAM database with a single FILESTREAM filegroup and sets the FILESTREAM data container to C:\ArtOfFS\Demos\Chapter7\NorthPole_fs. The next step is to create a FILESTREAM-enabled table (Listing 7-19).

    USE NorthPole
    GO
    CREATE TABLE [dbo].[Items](
        [ItemID] [int] IDENTITY(1,1) NOT NULL,
        [ItemGuid] [uniqueidentifier] ROWGUIDCOL NOT NULL UNIQUE,
        [ItemNumber] [varchar](20) NULL,
        [ItemDescription] [varchar](50) NULL,
        [ItemImage] [varbinary](max) FILESTREAM NULL,
        CONSTRAINT PKEY_Items PRIMARY KEY CLUSTERED ([ItemID] ASC)
    )

Listing 7-19: Creating a FILESTREAM table.

This table is a bit different from the previous examples; this time, the Items table has an IDENTITY column called ItemID, which has a PRIMARY KEY defined on it. The next step is to insert a few rows into the table (Listing 7-20).

    INSERT INTO Items (ItemGuid, ItemNumber, ItemDescription, ItemImage)
    VALUES
        (NEWID(), 'ITM001', 'Item 1', 0x),
        (NEWID(), 'ITM001', 'Item 1', 0x)

Listing 7-20: Populating the table with zero-length files.

In this simple example, there will only be two files in the FILESTREAM data container. However, imagine that the database has grown very large and the folder contains a large number of FILESTREAM data files, necessitating the move of the FILESTREAM filegroup to a new disk volume. For the purpose of keeping this example simple, let's assume that the new location is C:\ArtOfFS\Demos\Chapter7\NorthPole_fs2. Of course, in reality, the new location would be on a separate disk volume.

The approach we will take here is to add a new FILESTREAM filegroup at this new location, and rebuild the clustered index on the FILESTREAM table, which will move the FILESTREAM data to this new location. Listing 7-21 adds a new FILESTREAM filegroup into the NorthPole database.

    ALTER DATABASE NorthPole
    ADD FILEGROUP NorthPole_fs2
    CONTAINS FILESTREAM

Listing 7-21: Adding a new FILESTREAM filegroup.

Next, associate the filegroup with the location to which we want to move the FILESTREAM data (Listing 7-22).

    ALTER DATABASE NorthPole
    ADD FILE (
        NAME = NorthPole_fs2,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole_fs2'
    )
    TO FILEGROUP NorthPole_fs2

Listing 7-22: Adding a file to the newly created FILESTREAM filegroup.

The next step is to drop the clustered index (which is also the primary key of the table) using the script in Listing 7-23. If there is currently no clustered index on the table, then move on to the next step, which is creating a clustered index.

    ALTER TABLE [dbo].[Items] DROP CONSTRAINT [PKEY_Items]

Listing 7-23: Dropping the clustered index on the Items table.

Now, in Listing 7-24, we can re-create the clustered index, specifying the location of the new FILESTREAM filegroup. SQL Server will move the FILESTREAM data to the new location while the clustered index is being built.

    ALTER TABLE [dbo].[Items]
    ADD CONSTRAINT [PKEY_Items]
    PRIMARY KEY CLUSTERED ([ItemID] ASC)
    FILESTREAM_ON [NorthPole_fs2]

Listing 7-24: Rebuilding the clustered index to move the FILESTREAM data.

Notice the usage of the FILESTREAM_ON clause in Listing 7-24. By specifying the FILESTREAM_ON clause, you instruct SQL Server to move the FILESTREAM data to the specified filegroup.

Note that this approach requires attention to an important detail. If, for some reason, the clustered index is rebuilt again, the FILESTREAM_ON clause must be used again, to prevent the FILESTREAM data from moving back to the default filegroup.

An alternative to using the FILESTREAM_ON clause of the ALTER TABLE command is to first make the new target FILESTREAM filegroup the default FILESTREAM filegroup for the database, and then simply add the clustered index, as shown in Listing 7-25.

    ALTER DATABASE NorthPole MODIFY FILEGROUP NorthPole_fs2 DEFAULT
    GO
    ALTER TABLE [dbo].[Items]
    ADD CONSTRAINT [PKEY_Items]
    PRIMARY KEY CLUSTERED ([ItemID] ASC)

Listing 7-25: Setting the default FILESTREAM filegroup to move the data.

Whichever technique you use, once you have rebuilt the clustered index, look in the new FILESTREAM data location and you will see two FILESTREAM data files. Now look in the original FILESTREAM location and you will see two files there, too. The reason for this is that the garbage collector has not yet run; the next time it does, those files will be deleted.

To verify that new FILESTREAM data is going to the new location, run the code in Listing 7-26, which adds two more rows to the FILESTREAM table.

    INSERT INTO Items (itemguid, itemnumber, itemdescription, ItemImage)
    VALUES
        (NEWID(), 'ITM001', 'Item 1', 0x),
        (NEWID(), 'ITM001', 'Item 1', 0x)

Listing 7-26: Adding two more rows to the FILESTREAM table.
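If you would rather confirm the move from metadata than by browsing folders, a query along the following lines should help. It is a minimal sketch that assumes the Items table from Listing 7-19; it reads the filestream_data_space_id column of sys.tables and resolves it to a name through sys.data_spaces. The column aliases are illustrative only.

```sql
USE NorthPole;
GO
-- Show which data space currently holds the FILESTREAM data of dbo.Items.
SELECT  t.name       AS TableName,
        ds.name      AS FilestreamDataSpace,
        ds.type_desc AS DataSpaceType
FROM    sys.tables AS t
        JOIN sys.data_spaces AS ds
            ON ds.data_space_id = t.filestream_data_space_id
WHERE   t.name = 'Items';
```

After either of the two techniques described above, I would expect the query to report NorthPole_fs2 rather than NorthPole_fs.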

The new FILESTREAM data location contains two new files, which are not present in the original location. If you no longer need the original filegroup, you can remove it, by first removing the FILESTREAM data container and then the filegroup itself. Listing 7-27 shows how.

    ALTER DATABASE NorthPole
    REMOVE FILE NorthPole_fs ;
    ALTER DATABASE NorthPole
    REMOVE FILEGROUP NorthPole_fs ;

Listing 7-27: Removing the FILESTREAM data container and filegroup.

Note that the demonstrated behavior has an important consequence: because only one FILESTREAM filegroup can be marked as the default for the database, if you are intentionally spreading the FILESTREAM data over multiple filegroups (as discussed next), then creating or recreating a clustered index on a table with FILESTREAM data stored in a non-default filegroup will cause the data to be moved, which may be undesired.

Adding a new partition to share the FILESTREAM storage load

We can partition a FILESTREAM table so as to distribute the FILESTREAM storage to multiple disk volumes. By designing the table in this way, it is quite easy to add a new partition to a FILESTREAM table, in order to add more storage space, or balance the I/O load across multiple storage systems.

In this example, we'll create a partitioned FILESTREAM table and then add a new partition to move some of the FILESTREAM data to a new filegroup. First, run the script in Listing 7-28 to re-create the sample database.

    USE master
    GO
    IF DB_ID('NorthPole') IS NOT NULL
        DROP DATABASE NorthPole
    GO
    CREATE DATABASE NorthPole ON
    PRIMARY (
        NAME = NorthPole,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole.mdf'
    ),
    FILEGROUP NorthPole2 (
        NAME = NorthPole2,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole2.ndf'
    ),
    FILEGROUP NorthPole_fs1 CONTAINS FILESTREAM DEFAULT (
        NAME = NorthPole_fs1,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole_fs1'
    ),
    FILEGROUP NorthPole_fs2 CONTAINS FILESTREAM (
        NAME = NorthPole_fs2,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole_fs2'
    )
    LOG ON (
        NAME = NorthPole_log,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole_log.ldf'
    )
    GO

Listing 7-28: Creating a FILESTREAM database with multiple FILESTREAM filegroups.

This database contains a primary filegroup (NorthPole) containing the primary data file, a filegroup called NorthPole2, containing a secondary data file, two FILESTREAM filegroups, NorthPole_fs1 and NorthPole_fs2, and the log file. Purely for convenience, all of these are stored on the same disk volume but, in a real-world scenario, you'd probably put all these files on different disk volumes.

Next, we're going to create two partitions to store the FILESTREAM data. Listing 7-29 shows the partition function and partition schemes to do this, assuming we're going to partition the table by the ID column. Take a look at the code and then we'll discuss how it works.

    USE NorthPole
    GO
    CREATE PARTITION FUNCTION NPPartFN (INT) AS
    RANGE RIGHT FOR VALUES (4)
    CREATE PARTITION SCHEME NPPartDBSch AS
    PARTITION NPPartFN TO ([PRIMARY], [NorthPole2])
    GO
    CREATE PARTITION SCHEME NPPartFSSch AS
    PARTITION NPPartFN TO (NorthPole_fs1, NorthPole_fs2)
    GO

Listing 7-29: Creating a partition function and scheme.

First, this code creates a partition function, NPPartFN, which will split data across two filegroups; relational data will be split across the primary and secondary data files, and the FILESTREAM data will be divided across the two FILESTREAM filegroups.

We will associate this function with the ItemID column of the Items table, such that rows with ItemID less than 4 will go to the first filegroup (PRIMARY for relational data and NorthPole_fs1 for FILESTREAM data), and those with 4 and above will go to the second filegroup (NorthPole2 for relational data and NorthPole_fs2 for FILESTREAM data). Of course, these numbers are very small and are used only for demonstration purposes. In the real world, you could use a larger value, or you might want to partition based on date, or some other value.

The second part of the code creates two partition schemes; one for the relational data and the other for the FILESTREAM data. Note that the partition switching of FILESTREAM data works only when the RIGHT operator is used in the partition function. It does not work with LEFT, which is a confirmed bug in the current versions of SQL Server.

The next step is to create the Items table, exploiting our new partition scheme, as shown in Listing 7-30.

    USE NorthPole
    GO
    CREATE TABLE [dbo].[Items](
        [ItemID] [int] IDENTITY(1,1) NOT NULL,
        [ItemGuid] [uniqueidentifier] ROWGUIDCOL
            NOT NULL UNIQUE ON [PRIMARY],
        [ItemNumber] [varchar](20) NULL,
        [ItemDescription] [varchar](50) NULL,
        [ItemImage] [varbinary](max) FILESTREAM NULL
    ) ON NPPartDBSch(ItemID)
    FILESTREAM_ON NPPartFSSch

Listing 7-30: Creating the Items table on the partition scheme.

Listing 7-31 inserts six rows into the Items table. Based on our partition rules, the first three rows should go to the first filegroup and the rest should go to the second filegroup.

    INSERT INTO Items (itemguid, itemnumber, itemdescription, ItemImage)
    SELECT NEWID(), 'ITM001', 'Item 1', 0x1
    UNION ALL
    SELECT NEWID(), 'ITM002', 'Item 2', 0x2
    UNION ALL
    SELECT NEWID(), 'ITM003', 'Item 3', 0x3
    UNION ALL
    SELECT NEWID(), 'ITM004', 'Item 4', 0x4
    UNION ALL
    SELECT NEWID(), 'ITM005', 'Item 5', 0x5
    UNION ALL
    SELECT NEWID(), 'ITM006', 'Item 6', 0x6

Listing 7-31: Creating six new rows in the Items table.

Go to the FILESTREAM data container and verify that you see three files in the location associated with the first filegroup and another three files in the location associated with the second FILESTREAM filegroup.
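Counting files in folders works, but the same check can be made from T-SQL. The sketch below uses the built-in $PARTITION function with the NPPartFN partition function from Listing 7-29 to show how many rows fall into each partition. The column aliases are my own; everything else comes from the listings above.

```sql
USE NorthPole;
GO
-- Count the rows that the partition function assigns to each partition.
-- With RANGE RIGHT and a boundary of 4, the boundary value itself
-- belongs to the partition on its right (partition 2).
SELECT  $PARTITION.NPPartFN(ItemID) AS PartitionNumber,
        COUNT(*)                    AS RowsInPartition,
        MIN(ItemID)                 AS LowestItemID,
        MAX(ItemID)                 AS HighestItemID
FROM    dbo.Items
GROUP BY $PARTITION.NPPartFN(ItemID)
ORDER BY PartitionNumber;
```

Assuming the identity values run from 1 to 6 without gaps, partition 1 should hold ItemID 1 to 3 and partition 2 should hold ItemID 4 to 6, which matches the three files per container described above.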

Now, let's assume that this database has been running for years, and we are about to run out of space on the disk volume holding the filegroup NorthPole_fs2. To mimic this scenario, we'll simply add three more rows into the sample database so that the first filegroup still has three FILESTREAM data files and the second filegroup is "overloaded" with six FILESTREAM data files (Listing 7-32).

    INSERT INTO Items (itemguid, itemnumber, itemdescription, ItemImage)
    SELECT NEWID(), 'ITM007', 'Item 7', 0x7
    UNION ALL
    SELECT NEWID(), 'ITM008', 'Item 8', 0x8
    UNION ALL
    SELECT NEWID(), 'ITM009', 'Item 9', 0x9

Listing 7-32: Creating three more rows in the Items table.

What we want to do is create a new FILESTREAM filegroup (on a separate disk), then point our partition scheme at it so that all subsequent data will be stored in this new filegroup, and finally modify our partition function so that the three rows created in Listing 7-32 are moved to the new partition.

The first step is to add the new filegroups. It's not technically required to add a new data filegroup for every new FILESTREAM filegroup. This practice is used here only to make the code examples clear to follow. Thus, we will create two new filegroups, one for the relational data and the other for the FILESTREAM data (Listing 7-33).

    ALTER DATABASE NorthPole
    ADD FILEGROUP NorthPole_fs3 CONTAINS FILESTREAM
    ALTER DATABASE NorthPole
    ADD FILE (
        NAME = 'NorthPole_fs3',
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole_fs3'
    )
    TO FILEGROUP NorthPole_fs3

    ALTER DATABASE NorthPole
    ADD FILEGROUP NorthPole3
    ALTER DATABASE NorthPole
    ADD FILE (
        NAME = 'NorthPole3',
        FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole3.ndf'
    )
    TO FILEGROUP NorthPole3

Listing 7-33: Adding two new filegroups to the sample database.

Now we'll add the newly created filegroups into the partition schemes and mark each of them as the NEXT USED filegroup (Listing 7-34).

    ALTER PARTITION SCHEME NPPartFSSch
    NEXT USED NorthPole_fs3
    ALTER PARTITION SCHEME NPPartDBSch
    NEXT USED NorthPole3

Listing 7-34: Adding the new filegroups into the partition scheme.

We can now reconfigure the partition function to add another split at 7 (Listing 7-35); this will move all the rows with ItemID 7 and above out of the NorthPole2 and NorthPole_fs2 filegroups and into the NorthPole3 and NorthPole_fs3 filegroups.

    ALTER PARTITION FUNCTION NPPartFN()
    SPLIT RANGE (7);

Listing 7-35: Modifying the partition function.
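To check the result of the split from metadata, the catalog views can show which filegroup each partition maps to. The following is a sketch only, assuming the scheme names from Listing 7-29; it joins sys.partition_schemes, sys.destination_data_spaces, sys.data_spaces and sys.partition_range_values, and the column aliases are illustrative.

```sql
USE NorthPole;
GO
-- For each partition of both schemes, show the target filegroup and the
-- lower boundary value. For a RANGE RIGHT function, boundary N is the
-- lower bound of partition N + 1, so partition 1 has no lower boundary.
SELECT  ps.name            AS SchemeName,
        dds.destination_id AS PartitionNumber,
        ds.name            AS FilegroupName,
        prv.value          AS LowerBoundary
FROM    sys.partition_schemes AS ps
        JOIN sys.destination_data_spaces AS dds
            ON dds.partition_scheme_id = ps.data_space_id
        JOIN sys.data_spaces AS ds
            ON ds.data_space_id = dds.data_space_id
        LEFT JOIN sys.partition_range_values AS prv
            ON prv.function_id = ps.function_id
           AND prv.boundary_id = dds.destination_id - 1
WHERE   ps.name IN ('NPPartDBSch', 'NPPartFSSch')
ORDER BY ps.name, dds.destination_id;
```

After the split, I would expect three partitions per scheme, with lower boundaries of NULL, 4 and 7, and with the third partition mapped to NorthPole3 and NorthPole_fs3 respectively.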

Take a look in the FILESTREAM data container. You will see that three FILESTREAM data files have been added to the new location. The FILESTREAM data container belonging to the second filegroup still contains six files, but three of them are garbage and will be removed when the garbage collector runs next time.

Throughout this example, we've partitioned the relational data according to the same scheme as the FILESTREAM data. However, it is also possible to partition only the FILESTREAM data, and leave all the relational data in a single database file. Listing 7-29 can be modified to use the PRIMARY filegroup twice in the NPPartDBSch partition scheme, as shown in Listing 7-36.

    USE NorthPole
    GO
    CREATE PARTITION FUNCTION NPPartFN (INT) AS
    RANGE RIGHT FOR VALUES (4)
    CREATE PARTITION SCHEME NPPartDBSch AS
    PARTITION NPPartFN TO ([PRIMARY], [PRIMARY])
    GO
    CREATE PARTITION SCHEME NPPartFSSch AS
    PARTITION NPPartFN TO (NorthPole_fs1, NorthPole_fs2)
    GO

Listing 7-36: Creating a partition function and scheme to partition FILESTREAM data but not relational data.

Listings 7-33 and 7-34 can be modified similarly. In Listing 7-33, you would not create the filegroup NorthPole3 or the associated file. In Listing 7-34, you would not use the second ALTER PARTITION statement.
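Put together, the FILESTREAM-only variant of Listings 7-33 to 7-35 might look like the sketch below. One deliberate difference from the description above: rather than dropping the second ALTER PARTITION SCHEME statement, the sketch keeps it and points it at PRIMARY. This is a precaution based on my assumption that a split needs a NEXT USED filegroup in every scheme that uses the partition function; marking PRIMARY explicitly should be harmless if SQL Server turns out not to need it.

```sql
USE NorthPole;
GO
-- Add only the new FILESTREAM filegroup; no new relational filegroup.
ALTER DATABASE NorthPole
    ADD FILEGROUP NorthPole_fs3 CONTAINS FILESTREAM;
ALTER DATABASE NorthPole
    ADD FILE (NAME = 'NorthPole_fs3',
              FILENAME = 'C:\ArtOfFS\Demos\Chapter7\NorthPole_fs3')
    TO FILEGROUP NorthPole_fs3;
GO
-- FILESTREAM data for the new partition goes to the new filegroup.
ALTER PARTITION SCHEME NPPartFSSch NEXT USED NorthPole_fs3;
-- Precaution (assumption, see above): relational data stays in PRIMARY.
ALTER PARTITION SCHEME NPPartDBSch NEXT USED [PRIMARY];
GO
-- Split at 7, as in Listing 7-35.
ALTER PARTITION FUNCTION NPPartFN() SPLIT RANGE (7);
GO
```

With this variant, the relational rows never change filegroup; only the FILESTREAM files for ItemID 7 and above should move to the NorthPole_fs3 container.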

Migrating FILESTREAM Data

The recommended way to move or copy large databases from one location to another is to restore a full backup (see Chapter 8). However, in cases where restoring backups is not allowed, such as in a shared hosting location, or when only selected objects need to be restored, we can resort to scripting of the schema and data to transport SQL Server objects from one location to another. It is possible to script tables with FILESTREAM data, but this is recommended only when the database is relatively small and the BLOB values themselves are small.

SSMS scripting

The scripting functionality offered by SSMS allows us to create INSERT scripts that can be executed on another location to re-create the FILESTREAM data. You can do this by setting the export option Type of data to script to Schema and data or Data only, as shown in Figure 7-2.

Figure 7-2: Selecting the types of data to script.

The scripting process does not require any FILESTREAM-specific changes; it is the same as the process for a non-FILESTREAM database. SSMS generates INSERT statements to populate the target database with the same values as the source table, including the FILESTREAM columns. FILESTREAM columns are exported as a hex string, as shown in Listing 7-37.

    INSERT [dbo].[Items] ([ItemID], [ItemNumber], [ItemDescription], [ItemImage])
    VALUES (N'f326ed65-5131-4ef7-891a-89fad516b856',
    N'MS1001', N'Microsoft Mouse',
    0xFFD8FFE000104A46494600010100000100010000FFDB004300090607080706090807080A0A090B0D160F0D0C0C0D1B14151016201D2222201D1F1F2428342C242631271F1F2D3D2D335373A3A3A232B3F443F384334393A37FFDB0043010A0A0A0.................................................3737373737373737373737FFC00011080067007A030122000)

Listing 7-37: INSERT statement for a FILESTREAM column.

The Scripting Wizard allows us to generate scripts to be deployed on a different version of SQL Server. This is one of the areas where the scripting functionality adds value over regular backups, as we cannot, for example, restore a SQL Server 2008 R2 backup on SQL Server 2008.

We select the target SQL Server version in the Advanced Scripting Options screen (Figure 7-3).

Figure 7-3: Selecting the target SQL Server version for scripting.

We can generate scripts of a SQL Server 2008 FILESTREAM database and deploy to a SQL Server 2005 instance. The FILESTREAM columns will be deployed as VARBINARY(MAX) columns on the target database. However, we cannot script a FILESTREAM column for deployment to SQL Server 2000.

It appears that SSMS cannot script FILESTREAM values larger than 2 GB. However, this is unlikely to be a problem because, in the real world, I would not expect anyone to attempt to generate scripts for BLOB values larger than about 100 MB.
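Before committing to the scripting route, it may be worth checking whether any BLOB values are too large to script comfortably. The query below is a minimal sketch that assumes the Items table and its ItemImage column; the 100 MB threshold simply mirrors the rule of thumb above and the variable name is hypothetical.

```sql
USE NorthPole;
GO
-- Hypothetical cut-off of 100 MB; adjust to taste.
DECLARE @ThresholdBytes BIGINT = 100 * 1024 * 1024;

-- List the rows whose FILESTREAM value exceeds the threshold, largest first.
SELECT  ItemNumber,
        DATALENGTH(ItemImage) / 1048576.0 AS ImageMB
FROM    dbo.Items
WHERE   DATALENGTH(ItemImage) > @ThresholdBytes
ORDER BY DATALENGTH(ItemImage) DESC;
```

If the query returns any rows, a backup and restore is probably a better way to move those values than a generated script.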

The Database Publishing Wizard

Using the SQL Server Database Publishing Wizard, we can deploy database scripts and data to another local or remote server. The Database Publishing Wizard (versions 1.3 and 1.4) is integrated with Visual Studio, and allows us to publish directly from the Visual Studio IDE (Figure 7-4).

Figure 7-4: The Database Publishing Wizard.

The functionality offered by the Database Publishing Wizard is very similar to that of the SSMS Scripting Wizard we examined earlier.

Database Publishing Wizard
The current version of Database Publishing Wizard can be downloaded from: http://brurl.com/fs23.

The Database Publishing Wizard allows us to publish a SQL Server 2008 database to a SQL Server 2008 or 2005 database. When publishing from SQL Server 2008 to 2005, FILESTREAM columns will be created as VARBINARY(MAX) columns. We cannot publish a SQL Server 2008 database with FILESTREAM columns for deployment to a SQL Server 2000 instance.

Summary

This chapter walked through quite an array of database administration topics that are relevant to people managing FILESTREAM databases. Here's a brief summary of what we covered.

• How to detach and attach FILESTREAM databases and, in the process, move the FILESTREAM files to a new location.
• The behavior of FILESTREAM data under the various transaction isolation levels.
• How the recovery model of the database affects the way the garbage collector steps in and removes the garbage files. We also looked at the FILESTREAM tombstone table, which keeps track of the garbage files; the garbage collector uses this table to identify the files to be removed.
• Why it is not recommended that you touch the FILESTREAM data files in the NTFS folder directly, and why all access to the FILESTREAM data should be made through SQL Server only. Deleting files from the FILESTREAM data container or creating new files in the FILESTREAM data container can cause corruption in the database.

• How to detect corruption of FILESTREAM data using DBCC CHECKDB or DBCC CHECKTABLE, and how to fix corruption using the repair_allow_data_loss flag, although repairs involve deleting the relational data that was stored in the same row as the corrupted FILESTREAM data.
• Various methods for FILESTREAM database space management, including how to move the FILESTREAM storage location of a table that is not partitioned, and how to add a new partition to a partitioned FILESTREAM table to distribute the FILESTREAM storage load.
• How to script FILESTREAM tables and data using the built-in SSMS Scripting Wizard as well as the Database Publishing Wizard.

Chapter 8: Backup and Restore for FILESTREAM Databases

One of the benefits of the FILESTREAM feature over the traditional methods of storing the BLOB data in the file system is that a FILESTREAM database backup contains both the relational data and the BLOB data stored in the FILESTREAM data container. This removes the administrative overhead of having to maintain separate backups of the disk files, which is necessary when using traditional BLOB storage methods. Likewise, when the full backup of a FILESTREAM database is restored to another location, the FILESTREAM data is also restored, and is available along with the relational data.

The backup process for a FILESTREAM-enabled database is the same as that for a non-FILESTREAM database; the BACKUP statement is the same, regardless of whether or not the database being processed has the FILESTREAM feature enabled. The restore operation, however, is a little different for a FILESTREAM-enabled database since we need to specify the path where the FILESTREAM filegroups should be restored.

In this chapter, we will examine the points below.

• How to perform full, differential, and log backups for FILESTREAM databases; this is no different than for any standard database.
• How to restore a FILESTREAM database from a full backup, and how to perform a point-in-time restore.
• File backups and piecemeal restores to bring a database back online with only the relational data.
• Data and backup compression for FILESTREAM databases.
• The issue of garbage files being included in FILESTREAM database backups.

Creating and Populating the Sample Database

As usual, our preparatory step for the examples is to re-create our NorthPole database and the Items table with the FILESTREAM column, and populate the table with some fresh data. We've performed this step many times now so, without further ado, the script is in Listing 8-1. The only additional point to note is that, immediately after creating the database, we ALTER it to stipulate use of the FULL recovery model because we wish to perform transaction log backups for this database. It is not possible to perform transaction log backups for a database operating in the SIMPLE recovery model.

The script below starts by dropping the database if it already exists. If you intend to execute this script and the subsequent exercises repeatedly, as part of your tests or experiments, you might want to consider dropping the database from SSMS. When you drop a database from SSMS, it will also remove the backup history from the msdb database, resulting in a cleaner environment for the new database.

    USE master
    GO
    -- If the database already exists, drop and re-create it
    IF DB_ID('NorthPole') IS NOT NULL BEGIN
        DROP DATABASE NorthPole
    END
    GO
    -- Create a FILESTREAM-enabled database
    CREATE DATABASE NorthPole ON
    PRIMARY (
        NAME = NorthPole,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter8\NorthPole.mdf'
    ), FILEGROUP NorthPole_fs CONTAINS FILESTREAM (
        NAME = NorthPole_fs,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter8\NorthPole_fs'
    )
    LOG ON (
        NAME = NorthPole_log,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter8\NorthPole_log.ldf'
    )
    GO

    -- Set the database recovery model to full
    ALTER DATABASE NorthPole SET RECOVERY FULL
    GO
    -- Create a table with a FILESTREAM column
    USE NorthPole
    GO
    -- Create the "Items" table
    CREATE TABLE [dbo].[Items](
        [ItemID] UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
        [ItemNumber] VARCHAR(20),
        [ItemDescription] VARCHAR(50),
        [ItemImage] VARBINARY(MAX) FILESTREAM NULL,
        [CreatedDate] DATETIME DEFAULT GETDATE()
    )
    GO
    -- Insert the data into the table
    INSERT INTO Items (ItemID, ItemNumber, ItemDescription, ItemImage)
    SELECT
        NEWID(), 'MS1001', 'Microsoft Mouse',
        CAST(bulkcolumn AS VARBINARY(MAX))
    FROM OPENROWSET(
        BULK 'C:\ArtOfFS\Demos\Images\MicrosoftMouse.jpg', SINGLE_BLOB) AS x

Listing 8-1: T-SQL script to create the sample database for the examples in this chapter.

Backing up FILESTREAM Databases

In this section we'll consider the three most common types of database backup.

• Full database backup: backs up all the data and objects in all the data files associated with a database.
• Differential database backup: backs up any data and objects in all the data files for a given database that have changed since the last full backup.

• Transaction log backup: copies into a backup file all the log records inserted into the transaction log file since the last transaction log backup.

No FILESTREAM-specific commands or keywords are required to create any of these backup types.

Full backups

A full database backup is the cornerstone of most SQL Server backup and recovery schemes, and a full backup of a FILESTREAM database always includes the FILESTREAM data, as well as all the relational data.

Listing 8-2 takes a full backup of the NorthPole database and stores the backup file on the same drive as the database files. Of course, this is purely for convenience in this demo; in reality, you would always back up to a separate disk drive. Feel free to alter the path as required.

    -- Take a full backup
    USE master
    GO
    BACKUP DATABASE NorthPole
    TO DISK = 'C:\ArtOfFS\Demos\Chapter8\bak\NorthPole.bak'
    GO

Listing 8-2: T-SQL code to take a full backup of a FILESTREAM-enabled database.

Differential backups

A differential backup of a SQL Server database contains all changes since the previous full backup. Differential backups are especially useful when the database is large (and FILESTREAM databases are expected to be large), since a full backup of such a database is a time- and resource-consuming process. A typical backup scheme might consist of weekly full backups interspersed with nightly differential backups (plus transaction log backups, as required).

Again, the behavior of differential backups is the same with FILESTREAM as with non-FILESTREAM databases. Listing 8-3 inserts a new row into the NorthPole database and then captures a differential backup.

    -- Insert another row into the table
    USE NorthPole
    GO
    INSERT INTO Items (ItemID, ItemNumber, ItemDescription, ItemImage)
    SELECT
        NEWID(), 'MS1002', 'Microsoft Mouse 2',
        CAST(bulkcolumn AS VARBINARY(MAX))
    FROM OPENROWSET(
        BULK 'C:\ArtOfFS\Demos\Images\MicrosoftMouse.jpg', SINGLE_BLOB) AS x
    -- Take a differential backup
    USE master
    GO
    BACKUP DATABASE NorthPole
    TO DISK = 'C:\ArtOfFS\Demos\Chapter8\bak\NorthPole_Diff.bak'
    WITH DIFFERENTIAL
    GO

Listing 8-3: Capturing a differential database backup.

Transaction log backups

A transaction log backup captures every record in the log that has been generated since the last log backup. We can combine these log backups with full (and any differential) backups to restore a database to any point in time before the last log backup was taken, assuming we have an unbroken chain of log backups, taken after the relevant full backup.
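One way to see which backups exist for the database, and whether the log backups form a continuous chain, is to look at the backup history that SQL Server keeps in msdb. The query below is a sketch only; it uses the msdb.dbo.backupset and msdb.dbo.backupmediafamily tables, and the column aliases are my own.

```sql
-- List the recorded backups of NorthPole in the order they were taken.
SELECT  bs.backup_start_date,
        CASE bs.type
            WHEN 'D' THEN 'Full'
            WHEN 'I' THEN 'Differential'
            WHEN 'L' THEN 'Log'
            ELSE bs.type
        END                      AS BackupType,
        bs.first_lsn,
        bs.last_lsn,
        bmf.physical_device_name AS BackupFile
FROM    msdb.dbo.backupset AS bs
        JOIN msdb.dbo.backupmediafamily AS bmf
            ON bmf.media_set_id = bs.media_set_id
WHERE   bs.database_name = 'NorthPole'
ORDER BY bs.backup_start_date;
```

For consecutive log backups in an unbroken chain, the first_lsn of each log backup should equal the last_lsn of the log backup before it. Running the query again after each backup taken in this chapter shows the chain growing.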

Once again, no FILESTREAM-specific additional steps or commands are needed to take transaction log backups of FILESTREAM databases. Listing 8-4 inserts a third row into the NorthPole database, then captures a transaction log backup.

    -- Insert another row into the table
    USE NorthPole
    GO
    INSERT INTO Items (ItemID, ItemNumber, ItemDescription, ItemImage)
    SELECT
        NEWID(), 'MS1003', 'Microsoft Mouse 3',
        CAST(bulkcolumn AS VARBINARY(MAX))
    FROM OPENROWSET(
        BULK 'C:\ArtOfFS\Demos\Images\MicrosoftMouse.jpg', SINGLE_BLOB) AS x
    -- Take a log backup
    USE master
    GO
    BACKUP LOG NorthPole
    TO DISK = 'C:\ArtOfFS\Demos\Chapter8\bak\NorthPole_Log.trn'
    GO

Listing 8-4: Capturing a log backup.

Finally, to set the scene for our various restore examples, we're going to add a fourth row to the table, note the current time (we'll need this later), and then simulate a disaster in the form of someone accidentally deleting the contents of the Items table. We then take a second log backup (Listing 8-5).

    -- Insert another row into the table
    USE NorthPole
    GO
    INSERT INTO Items (ItemID, ItemNumber, ItemDescription, ItemImage)
    SELECT
        NEWID(), 'MS1004', 'Microsoft Mouse 4',
        CAST(bulkcolumn AS VARBINARY(MAX))
    FROM OPENROWSET(
        BULK 'C:\ArtOfFS\Demos\Images\MicrosoftMouse.jpg', SINGLE_BLOB) AS x
    SELECT GETDATE() AS Date
    /*
    Date
    -----------------------
    2012-03-08 14:22:28.040
    */
    -- Oops! No WHERE clause?!
    BEGIN TRAN
    DELETE FROM Items
    COMMIT TRAN
    -- Take a second log backup
    USE master
    GO
    BACKUP LOG NorthPole
    TO DISK = 'C:\ArtOfFS\Demos\Chapter8\bak\NorthPole_Log2.trn'
    GO

Listing 8-5: Fourth insert, accidental deletion from Items table, second log backup.

Restoring FILESTREAM Databases

Restoring a FILESTREAM database may require a little more attention than the backup process because, while restoring, we need to specify the location of the FILESTREAM filegroups in addition to the data and log files, if restoring to a new database. When performing an in-place restore, no additional parameters are required. In addition, FILESTREAM must be enabled on the SQL Server instance to which the database is being restored.
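Both prerequisites can be checked from T-SQL before a restore is attempted. The sketch below first reads the instance-level FILESTREAM settings through SERVERPROPERTY, then lists the files recorded in the full backup from Listing 8-2, so that the logical names needed for relocating the files are known. The column aliases are illustrative.

```sql
-- 1. Is FILESTREAM enabled on the target instance?
--    A level of 0 means FILESTREAM is disabled.
SELECT  SERVERPROPERTY('FilestreamConfiguredLevel') AS ConfiguredLevel,
        SERVERPROPERTY('FilestreamEffectiveLevel')  AS EffectiveLevel;

-- 2. Which files does the backup contain? The FILESTREAM data container
--    is listed alongside the data and log files, with its own logical name.
RESTORE FILELISTONLY
FROM DISK = 'C:\ArtOfFS\Demos\Chapter8\bak\NorthPole.bak';
```

For the sample database, the file list should show the logical names NorthPole, NorthPole_fs and NorthPole_log from Listing 8-1.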

Restoring from a full backup

The simplest RESTORE operation involves just a single full backup file. This type of restore is relatively common when, for example, refreshing a development environment with the latest version of a production database.

Currently, following our "accidental" deletion, the Items table is empty. Listing 8-6 restores our full database backup over the top of the existing NorthPole database and recovers the database.

    USE [master]
    GO
    RESTORE DATABASE NorthPole FROM
    DISK = 'C:\ArtOfFS\Demos\Chapter8\bak\NorthPole.bak'
    WITH REPLACE, RECOVERY
    GO

Listing 8-6: Restoring a full database backup over the existing database.

We have restored our NorthPole database, but so far only recovered one of the missing rows in our Items table. Fortunately, with the other backup files we've captured, we can repeat the restore operation and this time roll the database forward to recover the other three missing rows.

Point-in-time restore

Our current NorthPole database contains an Items table with just a single row of data, and our goal here is to restore the database to the point where it contained all four rows of data.

Our complete backup scheme for the NorthPole database consists of a full backup, a differential backup and two transaction log backups. With these backup files, we could perform another restore operation to completely restore the NorthPole database; in other words, we could restore each backup file in sequence, rolling the database forward to the state it existed in at the end of the second log backup, and then recover the database.

In fact, in this very simple case, we can perform a complete restore simply using the full backup and the two transaction log backups. Remember that a log backup captures all contents in the log file since the last log backup and so, in this case, the first transaction log backup will contain the log records for both the rows added after the full backup, and the second log backup will contain the details of the final insert. In a more realistic backup scheme, there would likely have been transaction log backups taken at regular intervals in the time between the full backup and the differential backup. In such a case, the differential backup would have allowed us to avoid having to restore those log backups, and our restore operation would instead have used just the full backup, the differential backup, and any subsequent log backups.

In any event, the problem we have here is that a complete restore would return the database to the state where the contents of the Items table were deleted since, when we restore the second log backup, it would roll forward, not only the fourth insert, but also the transaction that deleted the contents of Items. What we want to do instead, to get all the contents of our Items table back, is restore to a specific point in time within that second log backup; namely, the time that we recorded after the fourth insert, but before the accidental delete.

Fortunately, SQL Server supports point-in-time recovery of FILESTREAM databases; when you restore a FILESTREAM-enabled database from a certain point in time, the FILESTREAM data will be consistent with the relational data in the restored database.

Let's look at an example that performs a point-in-time restore of a FILESTREAM database. This time, instead of restoring over the top of the existing NorthPole database, we're going to restore to a new database, called NorthPole2. At the end of this operation what we should have is a NorthPole database, with one row in the Items table, and a NorthPole2 database with all four rows back in the Items table.
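Before starting a restore sequence like this one, it can help to confirm which backups exist and in what order they were taken. The query below is a minimal sketch, assuming the backups were taken on the current instance so that their history is still recorded in the msdb database; the column aliases are illustrative only.

```sql
-- List the recorded backups of NorthPole in chronological order.
SELECT
    bs.backup_start_date,
    bs.backup_finish_date,
    CASE bs.type
        WHEN 'D' THEN 'Full'
        WHEN 'I' THEN 'Differential'
        WHEN 'L' THEN 'Log'
        ELSE bs.type
    END AS BackupType,
    bmf.physical_device_name AS BackupFile
FROM msdb.dbo.backupset bs
INNER JOIN msdb.dbo.backupmediafamily bmf
    ON bmf.media_set_id = bs.media_set_id
WHERE bs.database_name = 'NorthPole'
ORDER BY bs.backup_start_date;
```

The point in time passed to STOPAT must fall within the period covered by the log backup being restored, so the start and finish dates listed here are a useful sanity check.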

The first step is to restore the full backup to create the NorthPole2 database, as shown in Listing 8-7. Notice the use of the NORECOVERY option; this instructs SQL Server to leave the database in a restoring state, ready to receive more backup files.

    -- Restore the full backup
    RESTORE DATABASE NorthPole2
    FROM DISK = 'C:\ArtOfFS\Demos\Chapter8\bak\NorthPole.bak'
    WITH NORECOVERY,
    MOVE 'NorthPole' TO 'C:\ArtOfFS\Demos\Chapter8\NorthPole2.mdf',
    MOVE 'NorthPole_log' TO 'C:\ArtOfFS\Demos\Chapter8\NorthPole2_log.ldf',
    MOVE 'NorthPole_fs' TO 'C:\ArtOfFS\Demos\Chapter8\NorthPole2_fs'
    GO

Listing 8-7: Restoring a full backup.

The code uses the MOVE option three times, to specify the location of the data file, log file, and FILESTREAM data container, respectively. Note that the leaf-level folder (NorthPole2_fs in this example) should not exist already, but the containing folder (C:\ArtOfFS\Demos\Chapter8) must exist! SQL Server reserves the "exclusive" rights to create and configure the NorthPole2_fs folder as part of the restore operation.

The next step is to completely restore the first transaction log backup, again leaving the NorthPole2 database in a restoring state.

    RESTORE LOG NorthPole2
    FROM DISK = 'C:\ArtOfFS\Demos\Chapter8\bak\NorthPole_log.trn'
    WITH NORECOVERY

Listing 8-8: Restoring the transaction log backup.

The final step is the interesting one; we apply the second log backup, but stop the restore at the point just before the data was accidentally deleted, and then recover the database. If you are following this example on your own system, replace the date and time in Listing 8-9 with the value you recorded when running Listing 8-5.

    RESTORE LOG NorthPole2
    FROM DISK = 'C:\ArtOfFS\Demos\Chapter8\bak\NorthPole_log2.trn'
    WITH RECOVERY,
    -- Replace the date and time with the date and time you noted earlier
    STOPAT = '2012-03-08 14:22:28.040'

Listing 8-9: Restoring the transaction log to a specified point in time.

Now, if we run a SELECT query on the Items table in the restored NorthPole2 database, you will see that there are four rows. If you run the same query on the original NorthPole database, you will see that the table contains one row (or no rows if you didn't run Listing 8-6).

You can verify that the FILESTREAM data is restored correctly to the point prior to the "wrong operation," by doing one of the following:

• run a SELECT query on the Items table and look at the ItemImage column; you will see that the column contains BLOB values which reference the FILESTREAM data
• go to Windows Explorer and look at the FILESTREAM data container; you will notice that there are four files in the folder.

Point-in-time recovery for FILESTREAM-enabled databases gives database administrators one more reason for moving to FILESTREAM storage from the traditional storage methods.

File Backups and Piecemeal Restores

Generally speaking, the fact that a database backup includes the FILESTREAM data files is a bonus; it reduces the administrative overhead for DBAs, as they no longer need to worry about separately backing up disk files.

However, there may be times when a DBA does not want the FILESTREAM data to be part of the regular database backups. For example, you may want to grab the most recent copy of the production database and send it to the development/test team. A production database with FILESTREAM data could be quite large, and you might only need the relational data for your development or test environment.

It is possible to create a full file backup of a FILESTREAM database to exclude the FILESTREAM filegroups. This is helpful when you do not care about the FILESTREAM data and only want to deal with the relational data. When you restore such a backup using the Enterprise edition of SQL Server, you will be able to query all the relational data. However, any query that tries to access the FILESTREAM data will obviously fail.

Let's go ahead and run some scripts to see this in action. First, rerun the script in Listing 8-1 to re-create the sample database, and then run the script provided in Listing 8-10 to take a backup of the PRIMARY filegroup (excluding the FILESTREAM filegroup).

    -- Take a backup
    BACKUP DATABASE NorthPole
    FILEGROUP = 'PRIMARY'
    TO DISK = 'C:\ArtOfFS\Demos\Chapter8\bak\NorthPolePrimary.bak'
    GO

Listing 8-10: Creating a file backup that excludes the FILESTREAM filegroup.

Now let's go ahead and create a new database from this partial backup.

    -- Restore
    RESTORE DATABASE NorthPolePartial
    FROM DISK = 'C:\ArtOfFS\Demos\Chapter8\bak\NorthPolePrimary.bak'
    WITH RECOVERY,
    PARTIAL,
    MOVE 'NorthPole' TO 'C:\ArtOfFS\Demos\Chapter8\NorthPolePartial.mdf',
    MOVE 'NorthPole_log' TO 'C:\ArtOfFS\Demos\Chapter8\NorthPolePartial_log.ldf'
    GO

Listing 8-11: Restoring a partial backup that excludes FILESTREAM filegroups.

The script in Listing 8-11 restores the file backup that we created earlier. Note the usage of the PARTIAL option in the restore process. When the script has successfully executed, the new database will be online and you will be able to query all the data that is available in the primary filegroup.

Any filegroups that were not part of the restore will be piecemeal restored and, in the case below, the FILESTREAM filegroup will be marked as RECOVERY_PENDING, as shown from the query against the system catalog view sys.database_files in Listing 8-12.

    USE NorthPolePartial
    GO
    SELECT
        type_desc,
        name,
        physical_name,
        state_desc
    FROM sys.database_files
    /*
    type_desc  name         physical_name                  state_desc
    ---------- ------------ ------------------------------ ----------------
    ROWS       NorthPoleDB  C:\ArtO..\NorthPolePartial.mdf ONLINE
    LOG        NorthPoleLOG C:\..\NorthPolePartial_log.ldf ONLINE
    FILESTREAM NorthPoleFS  C:\ArtOfFS\Dem...\NorthPole_fs RECOVERY_PENDING
    */

Listing 8-12: Retrieving the list of filegroups associated with a database.

The FILESTREAM filegroup will become a "ghost" and, unless you have a file backup of the FILESTREAM filegroup and associated transaction logs, there is no way to get rid of it. The data stored in such ghost filegroups will not be available to the database.

In this example, all the data in the Items table except the FILESTREAM columns will be available for reading and writing. However, if we attempt to access the FILESTREAM data, SQL Server will generate an error. Therefore, the SELECT * on the Items table will fail because this operation attempts to read the FILESTREAM column along with all the other columns, as illustrated in Listing 8-13.

    USE NorthPolePartial
    GO
    SELECT *
    FROM Items
    /*
    Msg 670, Level 16, State 1, Line 2
    Large object (LOB) data for table "dbo.Items" resides on an
    offline filegroup ("NorthPoleFS") that cannot be accessed.
    */

Listing 8-13: A SELECT * query on a FILESTREAM table will fail if the FILESTREAM filegroup is not available.

If only the relational data is selected, the query will be successful. If a partial restore sequence excludes any FILESTREAM filegroup, point-in-time restore is not supported. For more information see Piecemeal Restores (SQL Server) at http://brurl.com/fs24.

Data and Backup Compression for FILESTREAM Databases

One of the primary concerns with regard to having a database storing large amounts of BLOB data is the storage challenge, since FILESTREAM-enabled databases are expected to store huge amounts of data.

SQL Server 2008 introduced a data compression feature with compression of relational data at the page or row level. However, it does not compress the FILESTREAM data associated with the relational data.

It is possible to mark a FILESTREAM data container as "compressed" at the Windows level, which ultimately compresses the FILESTREAM data stored in the folder. However, this option is beneficial only if the data in the FILESTREAM data container is further compressible. If the data is already compressed (such as a zip file, jpeg, etc.) enabling compression at the NTFS level will incur unwanted CPU overhead. Though the early FILESTREAM documentation such as the FILESTREAM white paper at http://brurl.com/fs7 does not explicitly discourage using NTFS compression on FILESTREAM data containers, this blog post from the Microsoft Customer Service and Support team counsels against it (http://brurl.com/fs8).

However, it is possible to take a compressed backup of a FILESTREAM-enabled database. When taking a compressed backup, FILESTREAM data will also be compressed and, depending upon how compressible the FILESTREAM data is, you will notice a difference in the backup size.

Beware of Garbage Files in FILESTREAM Backups

In my experience, it's possible that a full backup of a FILESTREAM-enabled database will also include the garbage files present in the FILESTREAM data container at the time of taking the backup. SQL Server does not filter out the garbage files when creating the backup file. Even if the transaction that generated the garbage file is committed, the garbage files will still be part of the backup until they are removed by the garbage collector thread.

The following example demonstrates this behavior. Begin by recreating our sample database using the script provided in Listing 8-1. Once the database is created, run the script in Listing 8-14, which updates the FILESTREAM value 100 times to generate 100 garbage files.

    -- Update the data 100 times
    DECLARE @i INT = 1
    WHILE @i

Each iteration of the loop updates the value stored in the ItemImage column of the Items table, and each UPDATE operation generates a new garbage file. After the execution completes, you can see the garbage files by looking into the FILESTREAM data container associated with the ItemImage column. You will see 101 files in the folder, of which 100 are garbage files.

We have seen how to locate the FILESTREAM data container associated with a database/table/column in the earlier chapters. If you still find it difficult, you can find a number of useful queries in Chapter 9: Investigating FILESTREAM Databases.

We now take a full backup of the NorthPole database by running the script shown in Listing 8-15. We are using the WITH INIT option to overwrite any previous backup sets in the backup file.

    -- Take a backup
    BACKUP DATABASE NorthPole
    TO DISK = 'C:\ArtOfFS\Demos\Chapter8\bak\NorthPole.bak'
    WITH INIT
    GO

Listing 8-15: Generating a full backup file.

This full backup will include all the 100 garbage files we generated in the previous step. To verify this, we can restore the backup to create a new database, NorthPole3, and put the FILESTREAM data into C:\ArtOfFS\Demos\Chapter8\NorthPole3_fs (or any other folder that you prefer).

    -- Restore
    RESTORE DATABASE NorthPole3
    FROM DISK = 'C:\ArtOfFS\Demos\Chapter8\bak\NorthPole.bak'
    WITH RECOVERY,
    MOVE 'NorthPole' TO 'C:\ArtOfFS\Demos\Chapter8\NorthPole3.mdf',
    MOVE 'NorthPole_log' TO 'C:\ArtOfFS\Demos\Chapter8\NorthPole3_log.ldf',
    MOVE 'NorthPole_fs' TO 'C:\ArtOfFS\Demos\Chapter8\NorthPole3_fs',
    FILE = 1
    GO

Listing 8-16: Restoring a FILESTREAM database.

When the backup has been restored, go to the FILESTREAM data container and you will see 101 files instead of just the 1 that you would normally expect to see. It is quite surprising to see that all the garbage files came along with the backup, and they all got restored in the new location.

The garbage files unnecessarily increase the backup time and size, and increase the disk space requirements on the target location. My tests showed that a transaction log backup followed by a full backup on the target database clears the garbage files the next time the garbage collector runs.

Summary

We have seen that taking a backup of a FILESTREAM database is no different from backing up a non-FILESTREAM database. However, a restore operation may need a little bit more attention when restoring to a different database because you have to specify the location to which the FILESTREAM data is to be restored. In addition, you need to make sure that the FILESTREAM feature is enabled on the target SQL Server instance. SQL Server supports point-in-time recovery of FILESTREAM-enabled databases and this chapter presented an example to demonstrate this feature.

SQL Server allows the creation of full file backups of FILESTREAM databases that do not include the FILESTREAM filegroups. When such a backup is restored, all the relational data will be accessible. If you attempt to access the FILESTREAM data, SQL Server will generate an error.

The backup of a FILESTREAM database includes the garbage and orphaned files that are present in the FILESTREAM data container at the time of the backup. These files will be restored on the target location when the database is restored.

Chapter 9: Investigating FILESTREAM Databases

In the previous two chapters, we examined FILESTREAM database administration in some detail. To complete the discussion, we examine here our various options for investigating the FILESTREAM database and configuration metadata.

This chapter focuses on exposing a number of FILESTREAM-related system catalog views, along with a number of useful queries that help you to retrieve various valuable pieces of information about the FILESTREAM configuration. We also take a brief look at some of the in-depth information available from the FILESTREAM-related Dynamic Management Views (DMVs), and the available wait types that might help you troubleshoot various FILESTREAM performance issues.

Relevant System Catalog Views

SQL Server has a number of FILESTREAM-related system views that we can use to find information about FILESTREAM tables and columns.

We'll briefly review the system views that contain metadata relevant to FILESTREAM columns, and then take a look at some useful queries based on these views, which can give us such information as the location of the FILESTREAM data container, the names of all tables in a database that contain FILESTREAM columns, and so on.

sys.all_columns, sys.columns and sys.system_columns

The sys.all_columns view contains the columns belonging to all the user-defined objects and system objects in the current database.

The sys.columns view is essentially a subset of sys.all_columns and contains the columns belonging to all the table valued assembly functions, inline table valued functions, internal tables, system tables, table valued functions, user tables and views, again, in the current database.

Both views contain the is_filestream column, which returns 1 for FILESTREAM columns, and so can be used to determine all the FILESTREAM columns in the current database, as shown in Listing 9-1.

    SELECT
        s.name AS SchemaName,
        t.name AS TableName,
        c.name AS ColumnName
    FROM sys.columns c
    INNER JOIN sys.tables t ON t.object_id = c.object_id
    INNER JOIN sys.schemas s ON s.schema_id = t.schema_id
    WHERE c.is_filestream = 1
    /*
    SchemaName TableName ColumnName
    ---------- --------- ----------
    dbo        Items     ItemImage
    */

Listing 9-1: Retrieving all the FILESTREAM columns from the current database.

The sys.system_columns view contains a row for every column in all system objects and also has an is_filestream column which is expected to return 1 for all FILESTREAM columns. However, since none of the system tables in the SQL Server versions up to and including 2012 have FILESTREAM columns, this column will always contain 0. It is possible that a future version of SQL Server may have system tables with FILESTREAM columns.
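To see the relationship between these three views in practice, the same check can be run against sys.all_columns, which covers both user and system objects. This is a minimal sketch; the is_ms_shipped flag from sys.all_objects is used here only to show whether each match belongs to a system object, and the column aliases are illustrative.

```sql
-- FILESTREAM columns across user and system objects in the current database.
-- On the versions discussed here, every row is expected to have is_ms_shipped = 0.
SELECT
    OBJECT_SCHEMA_NAME(c.object_id) AS SchemaName,
    OBJECT_NAME(c.object_id) AS ObjectName,
    c.name AS ColumnName,
    o.is_ms_shipped
FROM sys.all_columns c
INNER JOIN sys.all_objects o ON o.object_id = c.object_id
WHERE c.is_filestream = 1;
```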

sys.computed_columns

The system view sys.computed_columns contains a record for every computed column in the current database.

sys.computed_columns has a column is_filestream, which indicates whether the column has a FILESTREAM attribute or not. Although SQL Server (2008, 2008 R2, and 2012) allows the creation of computed columns based on FILESTREAM columns, it does not allow the creation of computed columns with the FILESTREAM attribute. Therefore, the is_filestream column on every row in sys.computed_columns will always be 0.

sys.identity_columns

This view contains a list of all the columns in the current database that are marked as IDENTITY columns. This view has a column is_filestream, which will be always 0 because an IDENTITY column cannot have the FILESTREAM attribute.

sys.dm_repl_traninfo

This system view returns information about each replicated transaction on a database that is configured for replication.

This view has a FILESTREAM-related column fsinfo_address which is a VARBINARY(8) value that stores the in-memory address of the cached FILESTREAM information structure.
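The statements above about sys.computed_columns and sys.identity_columns are easy to check on your own database. The following is a small sketch that counts the rows flagged as FILESTREAM in each view; based on the behavior described above, both counts are expected to be zero. The result column names are illustrative.

```sql
-- Count columns flagged as FILESTREAM in each view.
SELECT 'sys.computed_columns' AS ViewName, COUNT(*) AS FilestreamColumns
FROM sys.computed_columns
WHERE is_filestream = 1
UNION ALL
SELECT 'sys.identity_columns', COUNT(*)
FROM sys.identity_columns
WHERE is_filestream = 1;
```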

sys.tables

The system catalog view sys.tables contains a row per user-defined table in the current database. The column filestream_data_space_id refers to the data space ID of a FILESTREAM filegroup or a partition scheme that consists of FILESTREAM filegroups.

sys.internal_tables

The system catalog view sys.internal_tables contains one row for every internal table in the current database. According to Books Online, internal tables are automatically generated by SQL Server to support specific features; they are generated when those features are enabled, and removed when the features are disabled. It looks like this is not true for the FILESTREAM-specific internal table, FILESTREAM_TOMBSTONE. It is also found in the model database and automatically added to every new database, whether or not the FILESTREAM feature is enabled.

This view has a number of columns that are relevant to the FILESTREAM feature.

• internal_type – A TINYINT column that specifies the type of the internal table being referenced by the current row. If the current database has a FILESTREAM tombstone table, it will show up in this view and the internal_type column will be set to 208. (For more information about the tombstone table, refer back to Chapter 7.)
• internal_type_desc – The text description of the internal_type column. For FILESTREAM tombstone tables, the value will be FILESTREAM_TOMBSTONE.
• name – The name of the table. FILESTREAM tombstone tables are named as filestream_tombstone_nnnnn, where nnnnn is a number that points to the object_id of the table.
• filestream_data_space_id – An integer value that is reserved for future use.
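The columns listed above can be inspected with a simple query. This is a minimal sketch that filters on the type description rather than the numeric type; running it in the model database, or in a database without FILESTREAM enabled, is one way to observe the behavior described above.

```sql
-- Locate the FILESTREAM tombstone table in the current database.
SELECT
    it.name,
    it.object_id,
    it.internal_type,
    it.internal_type_desc,
    it.create_date
FROM sys.internal_tables it
WHERE it.internal_type_desc = 'FILESTREAM_TOMBSTONE';
```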

sys.partitions

This view contains a row for each partition of all the tables and indexes in the given database. One of the columns in this view is filestream_filegroup_id, which points to the filegroup ID in the sys.filegroups table.

sys.data_spaces

This view returns a row for every filegroup, partition scheme, and FILESTREAM filegroup. The query in Listing 9-2 retrieves all the FILESTREAM filegroups in the current database.

    SELECT
        *
    FROM sys.data_spaces
    WHERE type_desc = 'FILESTREAM_DATA_FILEGROUP'
    /*
    name         data_space_id type type_desc                 is_default
    ------------ ------------- ---- ------------------------- ----------
    NorthPole_fs 2             FD   FILESTREAM_DATA_FILEGROUP 1
    */

Listing 9-2: Retrieving all the FILESTREAM filegroups in the current database.

The following columns of this view are related to the FILESTREAM feature.

• name – The name of the filegroup, partition scheme, or FILESTREAM filegroup.
• data_space_id – The unique data space ID.
• type – The type of data space. For FILESTREAM filegroups, this will be set to FD.
• type_desc – A description of the data space. For FILESTREAM filegroups, this will be FILESTREAM_DATA_FILEGROUP.
• is_default – If the filegroup is marked as the default filegroup, this will be set to 1.
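The filestream_filegroup_id column of sys.partitions, mentioned earlier, can be combined with sys.filegroups to show which FILESTREAM filegroup backs each partition. The query below is a sketch under the assumption that the join is made on data_space_id, as the description above suggests; the aliases are illustrative.

```sql
-- Map each partition that stores FILESTREAM data to its FILESTREAM filegroup.
SELECT
    OBJECT_NAME(p.object_id) AS TableName,
    p.index_id,
    p.partition_number,
    fg.name AS FilestreamFilegroup
FROM sys.partitions p
INNER JOIN sys.filegroups fg
    ON fg.data_space_id = p.filestream_filegroup_id
WHERE fg.type = 'FD'
ORDER BY TableName, p.partition_number;
```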

sys.database_files

The sys.database_files view lists all files defined in the current database. There are three columns of particular interest when querying for FILESTREAM-related information. First, the type and type_desc columns indicate the type of file. A type of 2 or type_desc of FILESTREAM indicates a FILESTREAM data container (which, as we know, is actually a folder in the file system and not a file). The physical_name column contains the full path in the file system where the database file is located. In the case of a FILESTREAM data container, it is the root of the FILESTREAM data container.

Upon close investigation, one will also find that the max_size and growth columns contain values of 0, which is never the case for data or log files.

sys.filegroups

This view inherits from sys.data_spaces and contains all the columns in the parent view and a few additional ones. The columns that are of interest for FILESTREAM are those that are inherited from sys.data_spaces, described above.

sys.system_internals_partition_columns

This system view contains a number of internal columns copied from the catalog of the relational engine. One of the columns relevant to this discussion is the is_filestream column that specifies whether the column being described has a FILESTREAM attribute. However, the information retrieved from this view is redundant and is available in other system views.

Books Online warns that this view is reserved for the internal system usage only and future compatibility is not guaranteed. This view is included for completeness only.
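Returning to sys.database_files, the columns discussed in that section can be viewed together with a short query. This is a minimal sketch that filters on the numeric file type; the size-related columns are included so that the values reported for a FILESTREAM data container can be compared with those of the data and log files.

```sql
-- Show the FILESTREAM data containers defined in the current database.
SELECT
    name AS LogicalName,
    type,
    type_desc,
    physical_name,
    state_desc,
    max_size,
    growth
FROM sys.database_files
WHERE type = 2;  -- 2 = FILESTREAM data container
```

Removing the WHERE clause returns the data and log files as well, which makes the comparison easier.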

sys.system_internals_partitions

This is another internal view that contains some FILESTREAM-related information. Two columns that are relevant to the FILESTREAM feature are filestream_filegroup_id and filestream_guid. filestream_filegroup_id refers to the filegroup ID of the FILESTREAM filegroup.

Again, Books Online warns that this view is reserved for the internal system usage only and future compatibility is not guaranteed. Personally, however, I found this view very helpful and have used it a number of times to retrieve various pieces of information related to the location of the FILESTREAM folders (see, for example, Listings 9-10 and 9-11).

sys.filestream_tombstone_*

This is an internal table that the FILESTREAM feature uses to keep track of the garbage files. When a garbage file is generated, it is recorded in this table and the garbage collector reads information from the tombstone table and cleans the garbage based on it. A detailed explanation of this table is provided in Chapter 7.

Useful FILESTREAM Metadata Queries

While working with FILESTREAM-enabled databases, you might very often come across cases where you need to query the FILESTREAM metadata to retrieve information. This section provides some FILESTREAM metadata queries, using many of the system catalog views discussed previously, which you might find very helpful.

SQL Server instance-level queries

Let's start with a few queries that are helpful for retrieving FILESTREAM-related information on the SQL Server instance level. These make extensive use of the system function, SERVERPROPERTY().

Is FILESTREAM enabled?

The system function SERVERPROPERTY() can be used to determine whether the FILESTREAM feature is enabled on the current SQL Server instance, as shown in Listing 9-3.

    IF SERVERPROPERTY('FilestreamEffectiveLevel') > 0
        PRINT 'FILESTREAM is enabled'
    ELSE
        PRINT 'FILESTREAM is disabled'

Listing 9-3: Determining whether FILESTREAM is enabled.

Identify the FILESTREAM access level

The system function SERVERPROPERTY() can be used to determine the FILESTREAM access level in the current SQL Server instance, as shown in Listing 9-4.

    SELECT
        SERVERPROPERTY('FilestreamConfiguredLevel')
            AS FilestreamConfiguredLevel,
        SERVERPROPERTY('FilestreamEffectiveLevel')
            AS FilestreamEffectiveLevel
    /*
    FilestreamConfiguredLevel FilestreamEffectiveLevel
    ------------------------- ------------------------
    3                         3
    */

Listing 9-4: Identifying the FILESTREAM access level at the Windows level.

Both FilestreamConfiguredLevel and FilestreamEffectiveLevel return a numeric value that indicates the current configuration value, as follows:

• 0 – FILESTREAM is disabled
• 1 – FILESTREAM is enabled for T-SQL access
• 2 – FILESTREAM is enabled for T-SQL and Win32 streaming access (local only)
• 3 – FILESTREAM is enabled for T-SQL and Win32 streaming access (local and remote) (Undocumented).

The difference between FilestreamConfiguredLevel and FilestreamEffectiveLevel is found in how FILESTREAM requires configuration changes at the Windows level, through Configuration Manager, and at the SQL Server instance level through SSMS or the sp_configure stored procedure (see Appendix A for more details).

Note that at time of writing (March 2012) the description of these server properties given in Books Online was not accurate.

The FilestreamConfiguredLevel property returns the value as configured in Configuration Manager. FilestreamEffectiveLevel returns the actual ("effective") level of FILESTREAM access that is enabled. FilestreamEffectiveLevel can be less than or equal to FilestreamConfiguredLevel. If the configuration at the instance level is different from the configuration at the Windows level, the lower access level will prevail. Let's examine this in the matrix below.

                                  Windows
    Instance                      0    1    2 (local only)    3 (local and remote)
    ----------------------------  ---  ---  ----------------  --------------------
    0: Disabled                   0    0    0                 0
    1: T-SQL access only          0    1    1                 1
    2: T-SQL and streaming access 0    1    2                 3

The matrix shows how the values configured at the instance level (rows) and those configured at the Windows level (columns) result in the FilestreamEffectiveLevel. You'll notice that the lower of the two always wins. For example, if the Windows configuration allows T-SQL and Win32 streaming access but the instance is configured for T-SQL access only, then the effective configuration is T-SQL access only. In plain terms, if the desired access level is remote streaming access, this will need to be configured as such, both at the Windows level and at the instance level.

The matrix also shows that it's not possible to control the difference between local and remote streaming access at the instance level. That difference can only be controlled at the Windows level. (Early CTP versions of SQL Server 2008 did have that ability, but for unknown reasons it was removed before the release.)

You can check the output of the following T-SQL statement to find the configured value at the SQL Server instance level (Listing 9-5).

    EXEC sp_configure 'filestream_access_level'
    /*
    name                    minimum maximum config_value run_value
    ----------------------- ------- ------- ------------ ---------
    filestream access level 0       2       2            2
    */

Listing 9-5: Identifying the FILESTREAM access level at the instance level.

The column config_value contains the current configured value at the SQL Server instance level:

• 0 – FILESTREAM is disabled
• 1 – FILESTREAM is enabled for T-SQL access
• 2 – FILESTREAM is enabled for T-SQL and Win32 streaming access.

Unlike the FilestreamEffectiveLevel and FilestreamConfiguredLevel server properties, the possible values for config_value do not include '3'.

Identify the FILESTREAM share name

The FILESTREAM share name can be retrieved from the SERVERPROPERTY() system function using the FilestreamShareName parameter, as shown in Listing 9-6.

    SELECT
        SERVERPROPERTY('FilestreamShareName')
            AS FilestreamShare
    /*
    FilestreamShare
    ---------------
    MSSQLSERVER
    */

Listing 9-6: Retrieving the FILESTREAM share name.

Identify the FILESTREAM-enabled databases

To determine whether a database is FILESTREAM-enabled, we need to check whether the database has a FILESTREAM filegroup or not. This can be identified from the system view sys.filegroups. The T-SQL code in Listing 9-7 checks for a FILESTREAM filegroup in each database in the current SQL Server instance.

    CREATE TABLE #db (dbname SYSNAME)
    EXECUTE sp_MSforeachdb 'INSERT INTO #db(dbname)
        SELECT ''?'' FROM [?].sys.filegroups
        WHERE type = ''FD'''
    SELECT * FROM #db
    DROP TABLE #db

Listing 9-7: Identifying all the FILESTREAM-enabled databases.

Database-level queries

These queries retrieve specific FILESTREAM-related information from the current database.

Identifying the FILESTREAM data container location

The path to the FILESTREAM data container(s) associated with the current database can be identified by looking at the physical_name column of the system view sys.database_files, as shown in Listing 9-8.

    SELECT
        db_name() AS DatabaseName,
        physical_name AS FilestreamFolder
    FROM sys.database_files
    WHERE type_desc = 'FILESTREAM'
    /*
    DatabaseName FilestreamFolder
    ------------ --------------------------------------
    NorthPole    C:\ArtOfFS\Demos\Chapter9\NorthPole_fs
    */

Listing 9-8: Identifying the FILESTREAM data container of a database.
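Both the instance-wide check in Listing 9-7 and the container lookup in Listing 9-8 can be approximated in a single statement by querying sys.master_files, which lists the files of every database on the instance. This is offered as an alternative sketch rather than a replacement: it avoids the undocumented sp_MSforeachdb procedure, but it identifies FILESTREAM-enabled databases by file type instead of by filegroup type. On SQL Server versions later than those covered in this book, containers for memory-optimized filegroups are also reported with this file type, so the result may need additional filtering there.

```sql
-- FILESTREAM data containers for all databases on the instance.
SELECT
    DB_NAME(mf.database_id) AS DatabaseName,
    mf.name AS LogicalName,
    mf.physical_name AS FilestreamFolder
FROM sys.master_files mf
WHERE mf.type = 2  -- 2 = FILESTREAM data container
ORDER BY DatabaseName, LogicalName;
```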

Identifying the tables with FILESTREAM columns

A FILESTREAM column can be identified by looking at the is_filestream column of the system view sys.columns, as shown in Listing 9-9.

SELECT DISTINCT
    OBJECT_NAME(OBJECT_ID) AS TableName
FROM sys.columns c
WHERE c.is_filestream = 1
/*
TableName
---------
Items
Items2
Items3
Items4
*/

Listing 9-9: Retrieving a list of all tables with FILESTREAM fields in a database.

Identifying the FILESTREAM folder names associated with each FILESTREAM table

SQL Server creates a root folder for every FILESTREAM table in the database. Since the folders are named using GUID values, it can be hard to tell them apart when you have several FILESTREAM tables in the database. Listing 9-10 shows a query that can be used to determine the folder names associated with each FILESTREAM table.

SELECT
    t.name AS TableName,
    d.name AS FileGroup,
    p.filestream_guid AS FolderName
FROM sys.tables t
INNER JOIN sys.system_internals_partitions p
    ON p.object_id = t.object_id
INNER JOIN sys.data_spaces d
    ON d.data_space_id = t.filestream_data_space_id
WHERE p.filestream_filegroup_id > 0
/*
TableName FileGroup     FolderName
--------- ------------- ------------------------------------
Items1    NorthPole_fs1 041C0478-8EB6-4C65-B0AD-12E29DF69D09
Items2    NorthPole_fs2 AEA2F2A4-4CA5-4790-AB16-1F78379D69C8
*/

Listing 9-10: Retrieving the FILESTREAM folders associated with each table in the current database.

Note that, without the WHERE clause, two rows are returned for each table with FILESTREAM columns. The second row contains NULL as the filestream_guid value. There are two rows in sys.system_internals_partitions for a single table with FILESTREAM columns. The difference seems to be in the filestream_filegroup_id field, which is set to "0" for the row that has the NULL in the filestream_guid field.

If the FILESTREAM table is partitioned, SQL Server will create a root folder per partition in the table. If you have partitioned FILESTREAM tables, you might need a slightly modified version of the above query to determine the FILESTREAM folder names of each partition, as shown in Listing 9-11. It is interesting to note that SQL Server does not list the NULL row we saw earlier when the table is partitioned; therefore, this query does not have the same WHERE clause.

SELECT
    t.Name AS TableName,
    sip.partition_number AS PartitionNumber,
    sip.filestream_guid AS FilestreamFolder
FROM sys.system_internals_partitions sip
INNER JOIN sys.partitions p
    ON p.partition_id = sip.partition_id
    AND sip.filestream_guid IS NOT NULL
INNER JOIN sys.tables t
    ON t.object_id = p.object_id
/*
TableName PartitionNumber FilestreamFolder
--------- --------------- ------------------------------------
Items     1               8C2E5B76-40E9-4F14-81D1-809DD2AB377E
Items     2               9410B335-DF04-48B1-BC14-6B7DE5022133
Items     3               5C3C61C5-978B-4014-A8BA-B35FE79127DA
*/

Listing 9-11: Retrieving the FILESTREAM folders of each partition.

Identifying the FILESTREAM filegroup associated with each FILESTREAM table

Listing 9-12 retrieves all the FILESTREAM tables in the current database and shows the FILESTREAM filegroups associated with each table.

SELECT
    t.name AS TableName,
    f.name AS FilestreamFileGroup,
    fl.physical_name AS FilestreamFolder
FROM sys.tables t
INNER JOIN sys.filegroups f
    ON f.data_space_id = t.filestream_data_space_id
INNER JOIN sys.database_files fl
    ON fl.data_space_id = f.data_space_id
/*
TableName FilestreamFileGroup FilestreamFolder
--------- ------------------- --------------------------------------
Items2    NorthPole_fs        C:\ArtOfFS\Demos\Chapter9\NorthPole_fs
Items3    NorthPole_fs        C:\ArtOfFS\Demos\Chapter9\NorthPole_fs
Items4    NorthPole_fs        C:\ArtOfFS\Demos\Chapter9\NorthPole_fs
Items     NorthPole_fs        C:\ArtOfFS\Demos\Chapter9\NorthPole_fs
*/

Listing 9-12: Retrieving all the FILESTREAM tables and their FILESTREAM filegroups.

If the table is partitioned, you will need a different version of the above query to retrieve the same information. This is given in Listing 9-13.

SELECT
    t.name AS [Table],
    f.name AS FilestreamFG,
    f.physical_name AS FilestreamFolder
FROM sys.tables t
INNER JOIN sys.partitions p
    ON p.object_id = t.object_id
INNER JOIN sys.database_files f
    ON f.data_space_id = p.filestream_filegroup_id
    AND f.type_desc = 'FILESTREAM'
/*
Table FilestreamFG  FilestreamFolder
----- ------------- ---------------------------------------
Items NorthPole_fs1 C:\ArtOfFS\Demos\Chapter7\NorthPole_fs1
Items NorthPole_fs2 C:\ArtOfFS\Demos\Chapter7\NorthPole_fs2
Items NorthPole_fs3 C:\ArtOfFS\Demos\Chapter7\NorthPole_fs3
*/

Listing 9-13: Retrieving all the FILESTREAM tables and their FILESTREAM filegroups on the partitioned database (from Chapter 7).

Identifying the FILESTREAM folder name associated with each FILESTREAM column

If a table has more than one FILESTREAM column, it can be hard to identify the folder associated with each column. Listing 9-14 retrieves a list of all FILESTREAM columns in a table, along with the FILESTREAM folder associated with the column.

DECLARE @objectID INT
SELECT @objectID = OBJECT_ID('Items4')

SELECT
    cp.name AS [ColumnName],
    p.partition_number AS [Partition],
    CAST(rs.colguid AS UNIQUEIDENTIFIER) AS [FolderName]
FROM sys.sysrscols rs
INNER JOIN sys.partitions p
    ON rs.rsid = p.partition_id
    AND rs.colguid IS NOT NULL
    AND p.object_id = @objectID
INNER JOIN sys.syscolpars cp
    ON cp.colid = rs.rscolid
    AND cp.id = @objectID
/*
ColumnName Partition   FolderName
---------- ----------- ------------------------------------
ItemImage1 1           BE0F0AE1-03CB-45EA-93E6-AE9CE4C3C875
ItemImage2 1           5F14959C-36A7-4A6F-9BC7-567A9B8CB5DB
ItemImage3 1           A505E6F6-2B34-454B-BF60-A142294349E0
ItemImage4 1           BB9EE893-0CDB-4849-B0FB-4E0DBCF57864
ItemImage5 1           D14A7768-8FCC-4B32-B776-89C2CDD8BB1F
*/

Listing 9-14: Retrieving the FILESTREAM folders associated with each FILESTREAM column in a table.

This query reads from a few "internal" tables that are not accessible to regular user queries. If you copy and paste the above query into your query window and run it, you will receive the following error:

Msg 208, Level 16, State 1, Line 4
Invalid object name 'sys.sysrscols'.

This happens because sys.sysrscols is not accessible to user-mode queries. To query objects like this, you need to connect to the database using a Dedicated Administrator Connection (DAC). Connecting to SQL Server using a DAC was discussed in Chapter 7.

Identifying the FILESTREAM file name associated with each FILESTREAM value

SQL Server names FILESTREAM files based on the transaction LSN that created the file. SQL Server keeps track of this value internally and uses it to identify the correct file when a read/write request comes in.

Though it is possible to identify the disk file associated with each FILESTREAM value in your FILESTREAM column, it is a lengthy process. SQL Server MVP Paul Randal has written a detailed blog post which explains how to do this. I would recommend you take a look at it (http://brurl.com/fs9).

FILESTREAM-related DMVs

SQL Server has a number of FILESTREAM-related DMVs that we can use to find very detailed information regarding FILESTREAM storage and activity against FILESTREAM databases.

sys.dm_filestream_file_io_handles

This DMV returns all the open FILESTREAM handles which any client application has opened using the Win32 API function OpenSqlFilestream or the SqlFilestream managed class. It has a number of columns providing detailed information about the open I/O handles. You can find a complete reference of all the columns in MSDN at http://brurl.com/fs10. The columns below may be particularly helpful.

• filestream_transaction_id – Shows the ID of the transaction associated with the given handle. This is the value returned by the GET_FILESTREAM_TRANSACTION_CONTEXT() T-SQL function. This field can be joined with the sys.dm_filestream_file_io_requests view to retrieve additional information.
• access_type – Shows the mode in which the file was opened:
    • R: file opened for read operation
    • W: file opened for write operation
    • RW: file opened for read and write operation.
• logical_path – Shows the logical path name of the file that is associated with the FILESTREAM handle. This is the same value that is returned by the PathName() method.
• physical_path – Shows the actual NTFS path name of the file being accessed. This column is populated only when trace flag 5556 is enabled.
• creation_client_process_id – Shows the ID of the Windows process which is responsible for the FILESTREAM transaction. This can help narrow down which application is misbehaving, and so on.

sys.dm_filestream_file_io_requests

This DMV displays a list of the FILESTREAM I/O requests being processed at the given moment. This view can be joined with sys.dm_filestream_file_io_handles on the filestream_transaction_id column to retrieve additional information about the FILESTREAM operations going on.

You may want to query this DMV to retrieve extended information about the FILESTREAM processing taking place on your server. You can find more information in Books Online at http://brurl.com/fs11.
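The join between the two DMVs described above might be sketched as follows. This is an illustrative query only: the handle columns are those listed earlier in this section, while current_spid, request_type and request_state are taken from the documented column list of sys.dm_filestream_file_io_requests and should be checked against Books Online for your version. Both DMVs normally require VIEW SERVER STATE permission.

```sql
-- Open FILESTREAM handles, with any I/O requests currently in flight
-- for the same FILESTREAM transaction. A LEFT JOIN keeps handles that
-- are open but idle at the moment the query runs.
SELECT
    h.filestream_transaction_id,
    h.creation_client_process_id,
    h.access_type,
    h.logical_path,
    r.current_spid,      -- assumed column name; verify in Books Online
    r.request_type,      -- assumed column name; verify in Books Online
    r.request_state      -- assumed column name; verify in Books Online
FROM sys.dm_filestream_file_io_handles h
LEFT JOIN sys.dm_filestream_file_io_requests r
    ON r.filestream_transaction_id = h.filestream_transaction_id;
```

Because requests are short-lived, the request columns will usually be NULL unless the query runs while a client is actively streaming data.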

sys.dm_tran_active_transactions

This DMV returns the active transactions in all the databases in the given SQL Server instance. For a transaction that involves FILESTREAM access, the filestream_transaction_id column will be non-NULL.

This view can be joined with sys.dm_filestream_file_io_handles on filestream_transaction_id to retrieve additional information.

The T-SQL code in Listing 9-15 shows how to retrieve all the active FILESTREAM transactions in the current instance.

SELECT *
FROM sys.dm_tran_active_transactions
WHERE filestream_transaction_id IS NOT NULL

Listing 9-15: Retrieving all the active FILESTREAM transactions in the current instance.

FILESTREAM and Wait Types

The FILESTREAM feature introduced a few new system wait types to deal with the various FILESTREAM-related operations, available through the sys.dm_os_wait_stats DMV.

FS_FC_RWLOCK

This wait type is generated by the FILESTREAM garbage collector. It occurs when garbage collection is disabled prior to a backup or restore operation, or when a garbage collection cycle is being executed.
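The wait types covered in this section can be inspected together with a query along the following lines. It is a sketch under the assumption that only the wait types named in this chapter are of interest; the figures in sys.dm_os_wait_stats are cumulative since the last restart of the instance (or since the statistics were last cleared).

```sql
-- Cumulative wait statistics for the FILESTREAM-related wait types
-- discussed in this chapter, most expensive first.
SELECT
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    max_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN (
    N'FS_FC_RWLOCK',
    N'FS_GARBAGE_COLLECTOR_SHUTDOWN',
    N'FS_HEADER_RWLOCK',
    N'FS_LOGTRUNC_RWLOCK',
    N'FSA_FORCE_OWN_XACT',
    N'FSAGENT',
    N'FSTR_CONFIG_MUTEX',
    N'FSTR_CONFIG_RWLOCK'
)
ORDER BY wait_time_ms DESC;
```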

FS_GARBAGE_COLLECTOR_SHUTDOWN

This wait type occurs during the cleanup process of a garbage collection cycle. It indicates that the garbage collector is waiting for the cleanup tasks to be completed.

FS_HEADER_RWLOCK

This wait type indicates that the process is waiting to obtain access to the FILESTREAM header file for a read or write operation. The FILESTREAM header is a disk file located in the FILESTREAM data container and is named filestream.hdr. This file contains important metadata information that SQL Server uses to manage the FILESTREAM data associated with a given database. Unlike FILESTREAM data files, SQL Server keeps an exclusive lock on the filestream.hdr file.

FS_LOGTRUNC_RWLOCK

This wait type indicates that the process is trying to access FILESTREAM log truncation, either to perform a log truncate operation, or to disable log truncation prior to a backup or restore operation.

FSA_FORCE_OWN_XACT

This wait type occurs when a FILESTREAM file I/O operation needs to bind to the associated transaction, but the transaction is currently owned by another session.

FSAGENT

FSAgent is part of the FILESTREAM subsystem which sits between the storage engine and the FILESTREAM data. FSAgent manages the handling of FILESTREAM data files in a transparent manner, so that the rest of the storage engine can treat them as regular VARBINARY(MAX) values. This wait type occurs when a FILESTREAM file I/O operation is waiting for FSAgent to perform an I/O operation. In the RTM version of SQL Server 2008, this wait type showed up in Activity Monitor, which could make performance troubleshooting more difficult. This issue is documented in Microsoft KB article 958942 (http://brurl.com/fs12) and fixed in Cumulative Update 2 for SQL Server 2008.

FSTR_CONFIG_MUTEX

This wait type occurs when there is a wait for another FILESTREAM feature reconfiguration to be completed.

FSTR_CONFIG_RWLOCK

This wait type occurs when there is a wait to serialize access to the FILESTREAM configuration parameters.

Summary

SQL Server exposes a number of system catalog views, as well as DMVs, which provide a good deal of insight into the internal metadata of FILESTREAM storage and configuration. We have examined those views that are particularly helpful in investigating FILESTREAM-enabled databases.

Chapter 10: Integrating FILESTREAM with other SQL Server Features

As we have seen in the previous chapters, the FILESTREAM feature is implemented as an additional attribute added to a VARBINARY(MAX) column. This model keeps FILESTREAM features and the complexities associated with managing BLOB data completely transparent to the T-SQL programming interface; T-SQL code can access FILESTREAM data just as if it were a regular VARBINARY(MAX) column. Mostly, this means that existing SQL Server features that work with VARBINARY(MAX) columns work with FILESTREAM columns without requiring modification.

In this chapter, we will look at how well the FILESTREAM feature aligns with the SQL Server features below.

• Replication
• Log shipping
• Full-text indexing
• Database snapshots
• Change Data Capture (CDC)
• Data compression
• Transparent data encryption
• Database Mirroring
• High Availability Disaster Recovery – Always On (HADRON).

FILESTREAM and Replication

Replication technologies provide a means to copy data from a primary database (publisher) to one or more secondary databases (subscribers). Even though replication wasn't designed for all the functions DBAs use it for today, it is nevertheless widely used as a solution for reporting, load balancing, synchronization, high availability, disaster recovery, and so on. SQL Server supports replicating FILESTREAM data but, of course, FILESTREAM data is stored outside the database (in the file system) and is often used to store very large data files, so it is interesting to examine how replication of FILESTREAM-enabled databases works.

Replicating FILESTREAM columns

When replicating SQL Server 2008 FILESTREAM-enabled databases, FILESTREAM columns can be replicated either with or without the FILESTREAM attribute.

If you replicate FILESTREAM columns with the FILESTREAM attribute, FILESTREAM data values on the publisher column will go to a FILESTREAM column on the subscriber. When you replicate a FILESTREAM database to a SQL Server 2008 or later instance, in most cases you would want to replicate the FILESTREAM attribute along with the FILESTREAM columns.

If you replicate FILESTREAM columns without the FILESTREAM attribute, FILESTREAM data values on the publisher will be replicated as VARBINARY(MAX) columns on the subscriber. In this way, it's possible to replicate a table with FILESTREAM columns to a SQL Server 2005 subscriber, which doesn't support FILESTREAM. However, it is not possible to replicate FILESTREAM columns to SQL Server 2000 subscribers. Table 10-1 summarizes the FILESTREAM replication support with previous versions of SQL Server.

Publication                  SQL Server 2008 and later  SQL Server 2005  SQL Server 2000 or earlier
---------------------------  -------------------------  ---------------  --------------------------
Replicate as FILESTREAM      Supported                  Not supported    Not supported
Replicate as VARBINARY(MAX)  Supported                  Supported        Not supported

Table 10-1: FILESTREAM replication support for SQL Server versions.

Note that it is not possible to create a single publication that replicates FILESTREAM columns with the FILESTREAM attribute to SQL Server 2008 subscribers and without the FILESTREAM attribute to SQL Server 2005 subscribers.

Size limits when replicating FILESTREAM columns as VARBINARY(MAX) columns

When replicating FILESTREAM values as VARBINARY(MAX) columns to SQL Server 2008 and later subscribers, the maximum data size supported is 2 GB. However, replicating large data values from SQL Server 2008 publishers to SQL Server 2005 subscribers is limited to a maximum of 256 MB of data. Replication will fail if the data value is larger.

By default, FILESTREAM columns are configured to replicate as VARBINARY(MAX) values. You can change this setting from the property page of the publication using SSMS. To do this, open the publication properties dialog box by right-clicking on the publication and selecting Properties. Then, on the Articles tab, right-click on the desired table article and select the Set Properties menu item. Change the property value Convert filestream to MAX data types to False (Figure 10-1).

Figure 10-1: Choosing to replicate FILESTREAM columns with the FILESTREAM attribute.

Click OK on the Article Properties dialog box and then again on the Publication Properties dialog box to save the changes. You must then reinitialize the subscriptions for the changes to take effect.

You can also use T-SQL to specify that you want to replicate FILESTREAM columns with the FILESTREAM attribute; details are provided later in the section, Managing the FILESTREAM replication flag using T-SQL.

Maximum replication data size

While configuring replication for FILESTREAM columns to VARBINARY(MAX) columns, one of the most important configuration parameters that needs attention is the maximum replication data size setting. This is a SQL Server instance-level configuration which, by default, is set to 64 KB.

Therefore, under the default maximum replication data size, and default FILESTREAM to VARBINARY(MAX) conversion settings, transactional or merge replication of a FILESTREAM column, or any other large-value type, will fail if we attempt to store values larger than 64 KB. And not only the replication, but also the INSERT or UPDATE operation will fail; SQL Server will abort the INSERT or UPDATE operation and generate the following error:

Length of LOB data (nnnn) to be replicated exceeds configured maximum 65536.
The statement has been terminated.

As we have seen several times in the previous chapters, FILESTREAM columns are ideal only if the values you expect to store in the FILESTREAM column are 1 MB or larger. This obviously means that when you set up transactional replication, you should either replicate to FILESTREAM, or change the maximum replication data size configuration setting so that values larger than 64 KB can be stored and replicated.

To change the maximum replication size configuration setting, right-click on the SQL Server instance in SSMS Object Explorer and click Properties, navigate to the Advanced tab and edit the setting, as shown in Figure 10-2.

Once the maximum replication data size setting has been increased, you will be able to replicate FILESTREAM data values to VARBINARY(MAX) that are less than, or equal to, the size specified.
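Before changing the setting, it can be useful to confirm what the instance is currently using. The following sketch reads the value from the sys.configurations catalog view; the LIKE pattern is used because the option's display name in that view carries a unit suffix, so treat the exact name as something to verify on your instance.

```sql
-- Show the configured and the running value of the maximum
-- replication data size. A value of 65536 is the 64 KB default;
-- -1 means no limit other than that of the data type.
SELECT
    name,
    value,          -- configured value
    value_in_use    -- value currently in effect
FROM sys.configurations
WHERE name LIKE N'max text repl size%';
```

If value and value_in_use differ, a change has been made with sp_configure but RECONFIGURE has not yet been run.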

Figure 10-2: Using SSMS to change the maximum data replication size.

To change the maximum replication data size using T-SQL, you configure the max text repl size setting using the system stored procedure, sp_configure. You can specify any value within the range of 0 to 2147483647 bytes, or simply specify -1 to instruct SQL Server to impose a restriction depending on the maximum storage capacity of the associated data type, as shown in Listing 10-1. As noted earlier, this raises the size limit to 2 GB when replicating to SQL Server 2008 and later instances, but the limit remains only 256 MB when replicating to SQL Server 2005 instances.

sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
sp_configure 'max text repl size', -1;
GO
RECONFIGURE;
GO

Listing 10-1: Using T-SQL to change the maximum replication data size.

FILESTREAM replication and the UNIQUEIDENTIFIER column

In Chapter 2, we discussed the fact that FILESTREAM columns can be added to a table only if it has a UNIQUEIDENTIFIER column that is marked as a ROWGUIDCOL with a UNIQUE constraint. When replicating FILESTREAM columns with the FILESTREAM attribute, we must ensure that the UNIQUEIDENTIFIER column is also selected for replication and that the UNIQUE constraint is replicated along with the UNIQUEIDENTIFIER column.

If we remove the UNIQUEIDENTIFIER column from the publication, or decide not to replicate the UNIQUE constraint, the FILESTREAM attribute will not be replicated to the subscriber; it will be replicated as a VARBINARY(MAX) column, even if we set the Convert FILESTREAM to MAX data types property to FALSE.

When a replication requirement exists, it is especially important to consider how to generate the values for the UNIQUEIDENTIFIER column of FILESTREAM tables. A new GUID value can be generated using either NEWID() or NEWSEQUENTIALID(). Using NEWSEQUENTIALID() is recommended over NEWID() because it provides better replication performance and less fragmentation of the data. It should be noted that NEWSEQUENTIALID() is only supported in a DEFAULT constraint, and this may not be appropriate for a particular scenario.
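As a minimal sketch of these requirements, the table definition below combines a ROWGUIDCOL column, a UNIQUE constraint and a NEWSEQUENTIALID() default. The table, column and constraint names are hypothetical, and the statement assumes it is run in a database that already has a default FILESTREAM filegroup.

```sql
-- Hypothetical FILESTREAM table prepared for replication:
-- the GUID column is the ROWGUIDCOL, carries a UNIQUE constraint,
-- and gets its values from NEWSEQUENTIALID() via a DEFAULT constraint.
CREATE TABLE dbo.ItemsDemo
(
    ItemID UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL
        CONSTRAINT DF_ItemsDemo_ItemID DEFAULT NEWSEQUENTIALID()
        CONSTRAINT UQ_ItemsDemo_ItemID UNIQUE,
    ItemName NVARCHAR(100) NOT NULL,
    ItemImage VARBINARY(MAX) FILESTREAM NULL
);

-- The GUID is generated by the default, so it is omitted from the INSERT.
INSERT INTO dbo.ItemsDemo (ItemName, ItemImage)
VALUES (N'Sample item', CAST('sample data' AS VARBINARY(MAX)));
```

Because NEWSEQUENTIALID() can only appear in a DEFAULT constraint, applications that need to know the GUID before the row is inserted would have to generate it themselves, for example with NEWID().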

When creating a merge or transactional publication with updatable subscriptions, SQL Server will automatically add a UNIQUEIDENTIFIER column with a ROWGUIDCOL attribute if the table does not already have one. On FILESTREAM tables, SQL Server will use the existing GUID column.

When adding a FILESTREAM column to a table that is already published for merge or transactional replication with updatable subscriptions, we don't need to add a new UNIQUEIDENTIFIER column because the replication process will already have done so when the table was published for two-way replication. However, we must ensure that the UNIQUEIDENTIFIER column has a UNIQUE constraint; if the column does not have a UNIQUE constraint, you need to add one before creating the FILESTREAM column on the table.

By default, merge replication and transactional replication are configured to replicate schema changes to the subscriber. If this setting is changed, the UNIQUE constraint will not be replicated to the subscribers and an error will be raised when the FILESTREAM column is replicated.

The current schema replication settings can be reviewed in the Subscription Options page of the Publication Properties dialog box: right-click on the publication and select Properties, and then select the Subscription Options tab. The Schema Replication section is at the bottom of the property table (Figure 10-3).

Figure 10-3: Setting the Schema Replication option.

We can also check this option by running the following replication stored procedures:

• for merge replication – sp_helpmergepublication
• for transactional replication (with or without updatable subscriptions) – sp_helppublication.

One of the columns returned by these stored procedures is replicate_ddl. If the publication is configured to replicate schema changes, the value in this column will be 1; otherwise, it will be 0.

We can use the replication stored procedures to change the replicate_ddl setting:

• for merge replication – sp_changemergepublication
• for transactional replication (with or without updatable subscriptions) – sp_changepublication.
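For a transactional publication, the check and the change described above might look like the sketch below. It assumes a publication named 'northpole' (the name used in this chapter's examples) and is meant to be run at the publisher, in the publication database.

```sql
-- Check the current setting: look for the replicate_ddl column
-- in the result set (1 = schema changes are replicated, 0 = not).
EXEC sp_helppublication
    @publication = N'northpole';

-- Turn schema replication on for the transactional publication.
EXEC sp_changepublication
    @publication = N'northpole',
    @property = N'replicate_ddl',
    @value = N'1',
    @force_invalidate_snapshot = 0,
    @force_reinit_subscription = 0;
```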

Listing 10-2 shows how to set this option to true using T-SQL for a merge publication.

EXEC sp_changemergepublication
    @publication = 'northpole',
    @property = N'replicate_ddl',
    @value = 1,
    @force_invalidate_snapshot = 0,
    @force_reinit_subscription = 0;

Listing 10-2: T-SQL script to update the replicate_ddl option of a publication.

MSDN documentation says that, if a UNIQUE constraint has been added manually and you wish to remove replication, the UNIQUE constraint should be removed first. If the UNIQUE constraint is not removed before attempting to drop the publication, the drop operation will fail. I could not reproduce this behavior during my testing, but keep this in mind while replicating FILESTREAM columns.

Managing the FILESTREAM replication flag using T-SQL

In the previous section we learned that a FILESTREAM column can be replicated with or without the FILESTREAM attribute. Then we saw how to change the configuration to replicate the FILESTREAM attribute or disable the replication of the FILESTREAM attribute using the SSMS dialog boxes. Here, we examine how the FILESTREAM replication flag can be managed using T-SQL; in other words, how to enable and disable the replication of the FILESTREAM attribute using the replication system stored procedures.

The configuration settings of any article that is part of a publication can be changed using the stored procedures:

• for merge replication – sp_changemergearticle
• for transactional replication (with or without updatable subscriptions) – sp_changearticle.

Both of these stored procedures accept the schema_option parameter; this parameter is a bitmap value, where every bit (or "flag") represents the status of a specific configuration setting, one of which determines whether or not SQL Server will replicate the FILESTREAM attribute. The value of this parameter is the result of a bitwise OR of the values of all the bits that are set.

To replicate a FILESTREAM column along with the FILESTREAM attribute, the bit represented by 0x100000000 should be set; to disable replication of the FILESTREAM attribute, this bit should be cleared, being careful not to set or clear any bits other than the intended one.

The recommended way to change the FILESTREAM replication flag is as follows:

• read the existing value of the schema_option configuration setting
• set or clear the FILESTREAM replication flag (0x100000000)
• write the modified configuration value back to the publication.

Read the current configuration setting

The first step is to read the current value of the schema_option configuration setting, using either the sp_helpmergearticle system stored procedure for merge replication or sp_helparticle for transactional replication, as shown in Listing 10-3.

-- Look for the previous settings. Run sp_helparticle and
-- look for the "schema_option" column.
exec sp_helparticle 'northpole', 'items'

Listing 10-3: Querying the current settings of an article in a transactional publication.

Read the value of the schema_option column from the output of this statement. This can be automated by storing the output of the stored procedure in a temp table and reading from that temp table. The default value of the schema_option is 0x000000000803509F.

Setting the FILESTREAM replication flag

Having retrieved the current configuration value of the schema_option setting, we can go ahead and set the FILESTREAM replication flag. First, we can determine the current status of the FILESTREAM flag by performing a bitwise AND operation on the current schema_option configuration value returned by Listing 10-3, as shown in Listing 10-4.

DECLARE @val BIGINT
-- Assume that the current value is 0x000000010803509F
SELECT @val = 0x000000010803509F
-- Check whether FILESTREAM replication flag is enabled
IF @val & 0x100000000 <> 0
    PRINT 'FILESTREAM option is ON'
ELSE
    PRINT 'FILESTREAM option is OFF'

Listing 10-4: Checking whether the FILESTREAM replication flag is currently enabled.

Having determined the current status of the FILESTREAM flag, you can set or clear the flag as required. As previously mentioned, the FILESTREAM replication setting is turned on by setting the appropriate bit in the schema_option bitmap value. A specific bit on a byte (or byte stream) can be turned on by performing a bitwise OR operation. However, a bitwise operation cannot be performed on a VARBINARY value, so we have to start with a BIGINT value, perform the required bitwise operation and then convert the new value to a VARBINARY so that a string representation of the hex value can be created from that, as shown in Listing 10-5.

DECLARE
    @options VARBINARY(8),
    @prev BIGINT
-- Set the flag to replicate FILESTREAM flag using a bitwise OR
-- We'll assume that the previous value is 0x000000000803509F
SELECT @prev = 0x000000000803509F
SELECT @prev = @prev | 0x100000000
SELECT @options = CAST(@prev AS VARBINARY(8))
SELECT @options
-- Original Value: 0x000000000803509F
-- New Value     : 0x000000010803509F
-- Note the flag : ---------^--------

Listing 10-5: Turning the FILESTREAM replication flag on by performing a bitwise OR operation.

To turn off the FILESTREAM replication flag, we need to run the reverse operation and clear a specific bit in the configuration value. This is done by combining the value, using a bitwise AND, with the bitwise NOT (complement) of the flag, which leaves every other bit untouched and has no effect if the flag is already off. Listing 10-6 shows how to do this.

DECLARE
    @options VARBINARY(8),
    @prev BIGINT
-- Clear the FILESTREAM repl flag using a bitwise AND with the
-- bitwise NOT (complement) of the flag
-- We'll assume that the previous value is 0x000000010803509F
SELECT @prev = 0x000000010803509F
SELECT @prev = @prev & ~CAST(0x100000000 AS BIGINT)
SELECT @options = CAST(@prev AS VARBINARY(8))
SELECT @options
-- Original Value: 0x000000010803509F
-- New Value     : 0x000000000803509F
-- Note the flag : ---------^--------

Listing 10-6: Turning the FILESTREAM replication flag off by performing a bitwise AND with the NOT of the flag.
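The choice of bitwise operator matters when a flag is being cleared. The sketch below, which uses the default schema_option value quoted earlier as its starting point, compares the operators side by side: OR sets the bit, AND with the complement of the mask clears it regardless of its current state, whereas XOR merely toggles it and would therefore switch the flag on if it happened to be off. The variable names are illustrative.

```sql
DECLARE @mask BIGINT = CAST(0x0000000100000000 AS BIGINT); -- FILESTREAM flag
DECLARE @off  BIGINT = CAST(0x000000000803509F AS BIGINT); -- flag currently off

SELECT
    -- OR sets the flag:                          0x000000010803509F
    CAST(@off | @mask AS VARBINARY(8))            AS SetWhenOff,
    -- AND NOT clears a flag that is on:          0x000000000803509F
    CAST((@off | @mask) & ~@mask AS VARBINARY(8)) AS ClearWhenOn,
    -- AND NOT leaves a flag that is off alone:   0x000000000803509F
    CAST(@off & ~@mask AS VARBINARY(8))           AS ClearWhenOff,
    -- XOR toggles, so it turns an off flag ON:   0x000000010803509F
    CAST(@off ^ @mask AS VARBINARY(8))            AS XorWhenOff;
```

For this reason, AND with the complement is the safer choice in a script that may be run more than once.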

Writing back the modified configuration value

The final step is to write the modified configuration value back to the publication, using either the sp_changemergearticle stored procedure for merge replication or sp_changearticle for transactional replication. This is a little more complicated than you might think, since the schema_option parameter is an NVARCHAR parameter, which expects a hex representation of the configuration value.

Creating the string representation of a hex value is a little tricky, too. One method is to use the XML data type methods and the xs:hexBinary() function; Listing 10-7 takes a shorter route and uses the fn_varbintohexstr system function to build a string representation of a hex value.

DECLARE @optInt BIGINT
-- Assume that the original value is 0x000000000803509F
SELECT @optInt = 0x000000000803509F
-- Turn ON the FILESTREAM replication flag
SELECT @optInt = @optInt | 0x100000000
-- Build a VARBINARY(8) value from the BIGINT value
DECLARE @optHex VARBINARY(8)
SELECT @optHex = CAST(@optInt AS VARBINARY(8))
-- Convert the VARBINARY value to a hex string
DECLARE @optNv NVARCHAR(30)
SELECT @optNv = master.dbo.fn_varbintohexstr(@optHex)
SELECT @optNv
--returns '0x000000010803509F'

Listing 10-7: Building a string representation of a hex value.

Having built an NVARCHAR value from the modified schema_option configuration value, we can go ahead and execute the replication stored procedures to write the value back to the publication, as shown in Listing 10-8.
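On SQL Server 2008 and later, the conversion step can also be written with the documented CONVERT function and binary style 1, which produces a character string with the 0x prefix. This is offered as an alternative sketch, not as the method used in the listing above; the variable names simply mirror those of Listing 10-7.

```sql
-- Start from a schema_option value with the FILESTREAM flag already set.
DECLARE @optInt BIGINT = CAST(0x000000010803509F AS BIGINT);

-- Style 1 converts binary data to a hex string including the '0x' prefix.
DECLARE @optNv NVARCHAR(30) =
    CONVERT(NVARCHAR(30), CAST(@optInt AS VARBINARY(8)), 1);

SELECT @optNv AS SchemaOptionString;   -- '0x000000010803509F'
```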

exec sp_changearticle
    @publication = N'northpole',
    @article = N'Items',
    @property = N'schema_option',
    @value = @optNv, -- built in Listing 10-7; must be in the same batch
    @force_invalidate_snapshot = 1,
    @force_reinit_subscription = 1

Listing 10-8: Writing back the modified schema_option value into the publication.

Impact of FILESTREAM replication configuration changes

Changing the FILESTREAM replication flag requires generating a new replication snapshot and reinitializing all the subscribers. Turning the FILESTREAM replication flag off is relatively harmless, but turning it on needs some careful thought.

When you turn the FILESTREAM replication flag on:

• any SQL Server 2005 subscribers to the publication will break; a publication cannot be replicated to SQL Server 2005 subscribers if the FILESTREAM replication flag is turned on
• any SQL Server 2008 subscribers will break if the subscription database is not FILESTREAM enabled and does not have a default FILESTREAM filegroup.

Replication log reader and the FILESTREAM garbage collector

As discussed in Chapter 7, SQL Server does not allow in-place updates of FILESTREAM values. If you attempt to modify a FILESTREAM value either using T-SQL or using the streaming APIs, SQL Server creates a completely new copy of the FILESTREAM data file with the modified value. The old file stays around in the FILESTREAM data container until the garbage collector steps in and deletes it.

Every FILESTREAM operation is linked to an LSN in the database transaction log. The garbage collector will remove a file marked for garbage collection only when the transaction log is truncated past the LSN that created the FILESTREAM garbage file.

When a FILESTREAM database is published, the transaction log cannot be truncated until the replication log reader agent reads the information from the transaction log. This can delay the FILESTREAM garbage collection, because the garbage collector cannot delete the files until the transaction log is truncated.

It is therefore possible that you will notice FILESTREAM garbage files lying around for a longer period if the FILESTREAM columns are replicated and the replication log reader agent does not run frequently.

FILESTREAM and synchronization methods

It is important to note that, if a publication includes tables having FILESTREAM columns, the snapshot synchronization methods offered by SQL Server Enterprise Edition cannot be used.

SQL Server Enterprise Edition provides two efficient synchronization methods for building the replication snapshot, namely database snapshot and database snapshot character, both of which generate the replication snapshot from a database snapshot, resulting in no locks being applied on the tables while the replication snapshot is being created. Database snapshots cannot include FILESTREAM filegroups (explained later in this chapter) and therefore, if the publication includes FILESTREAM columns, snapshot synchronization methods are not supported.

Replicating databases with multiple FILESTREAM filegroups

Many real-world FILESTREAM databases will have more than one FILESTREAM filegroup, and it is important to understand how replication works with multiple FILESTREAM filegroups.

By default, when you replicate a FILESTREAM database, all FILESTREAM data will go to the default FILESTREAM filegroup. However, we can configure the publication to replicate FILESTREAM data to its own filegroup in the subscriber by setting another flag in the schema_option configuration value.

In the previous section, we saw how to set or clear flags in the schema_option configuration value. In a similar way, we can modify an article to replicate FILESTREAM data to the same filegroups in the subscriber by turning on the bit represented by 0x800000000.

Note that the replication process does not create FILESTREAM filegroups on the subscriber database. If you use this option, you must create the required FILESTREAM filegroups on the target database before applying the initial snapshot.

The option to set the FILESTREAM filegroup is available only through T-SQL. The SSMS property page does not have this option.
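Turning on the 0x800000000 bit could look something like the sketch below. It assumes the publication northpole and the article Items used in Listing 10-8, reads the current value from the replication system tables in the publication database, and may differ in detail from the approach used in the earlier listings. The variable names are illustrative.

```sql
-- Sketch only. Run in the publication database.
-- Assumptions: publication 'northpole', article 'Items' (as in Listing 10-8).
DECLARE @opt BIGINT,
        @optNv NVARCHAR(20);

-- Read the article's current schema_option, which is stored as BINARY(8).
SELECT @opt = CAST(a.schema_option AS BIGINT)
FROM dbo.sysarticles AS a
     JOIN dbo.syspublications AS p ON p.pubid = a.pubid
WHERE p.name = N'northpole'
  AND a.name = N'Items';

-- Turn on the bit that keeps FILESTREAM data in its own filegroup.
SET @opt = @opt | CAST(0x0000000800000000 AS BIGINT);

-- sp_changearticle expects the value as a hexadecimal string ('0x...').
SET @optNv = CONVERT(NVARCHAR(20), CAST(@opt AS BINARY(8)), 1);

EXEC sp_changearticle
    @publication = N'northpole',
    @article = N'Items',
    @property = N'schema_option',
    @value = @optNv,
    @force_invalidate_snapshot = 1,
    @force_reinit_subscription = 1;
```

Remember that the matching FILESTREAM filegroups must already exist on the subscriber before the new snapshot is applied.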

FILESTREAM and the different replication types

Here, we briefly examine the behavior of FILESTREAM columns specific to each replication type.

Transactional replication

FILESTREAM works beautifully with transactional replication (with or without updatable subscriptions). For either type of transactional replication, setting up publishers and subscribers for FILESTREAM columns is not much different from what we would usually do for any other column.

One difference is the option that allows us to decide whether FILESTREAM columns should be replicated with the FILESTREAM attribute, which we have already discussed. Another is the option that determines whether the FILESTREAM columns should go to the default FILESTREAM filegroup or to filegroups with the same name in the subscriber.

In the Article Properties dialog box (shown in Figure 10-1), the Copy filegroup associations property determines whether replicated data is stored in the default filegroups at the subscriber or in the filegroups with matching names from the publisher. The default value is False, meaning that the replicated data will be stored in the subscribers' default filegroups.

Merge replication

Merge replication allows two-way synchronization of data between the publisher and subscribers. The FILESTREAM-related configuration changes that you might need to make when setting up merge replication are pretty much the same as those for transactional replication.

Aside from the additional options relating to FILESTREAM columns noted in the previous section, one specific difference when using merge replication is that if you have SQL Server 2005 (or earlier) subscribers, the option to replicate FILESTREAM as VARBINARY(MAX) is not available in the SSMS wizard page (Figure 10-4) or the article properties page.

Figure 10-4: Replicate FILESTREAM is not supported for merge replication.

If you select SQL Server 2005, you can still replicate FILESTREAM columns; the publication will be configured to replicate FILESTREAM columns as VARBINARY(MAX) columns and the SSMS property page will not allow you to change it. If you select SQL Server 2000 and include FILESTREAM columns, the publication wizard will fail because a FILESTREAM column cannot be replicated to a SQL Server 2000 database.

Merge replication includes a memory optimization, which is controlled by the stream_blob_columns parameter of the sp_addmergearticle and sp_changemergearticle stored procedures. Turning this option on while replicating large BLOB values significantly improves replication performance. It is therefore recommended to have this option turned on when you are replicating FILESTREAM columns. It is especially important when the FILESTREAM column is added to a table that is already published, because this setting may not have been turned on when the table was originally published. In such a case, you need to turn this flag on explicitly.

The current status of this setting can be determined by running the replication stored procedure sp_helpmergearticle and looking at the stream_blob_columns column in the output.

When using merge replication with web synchronization over the HTTPS protocol, the maximum size of data values is limited to 50 MB. A runtime error will be raised if the data size is larger than 50 MB.

Snapshot replication

Creating snapshot replication for FILESTREAM columns is very similar to what we have seen with transactional replication. One immediate difference that you might notice is that the max text repl size option is ignored by snapshot replication. This configuration option is explained earlier in this chapter.

FILESTREAM and Log Shipping

Log shipping is a popular way of ensuring high availability of databases by sending transaction logs to one or more secondary SQL Server servers at specified intervals. SQL Server allows log shipping of FILESTREAM-enabled databases.

To configure log shipping of FILESTREAM-enabled databases, the secondary servers must be running SQL Server 2008 or later and have FILESTREAM enabled. Beyond this, one point that needs attention is the connectivity between the primary and secondary servers. Transaction logs shipped using log shipping will include FILESTREAM data as well, which will make the shipped logs very big if your database deals with a lot of FILESTREAM operations involving large BLOB values. It is also important to understand that, when a FILESTREAM value is updated, SQL Server creates a new file with the modified value. This new copy of the file will be included in the transaction log backup, which has an impact on the overall size of the transaction log backups.

FILESTREAM and Full-Text Indexing

Full-text indexing works with FILESTREAM columns in just the same way as with VARBINARY(MAX) columns, with the only difference being that the 2 GB limit is not applicable when indexing FILESTREAM columns.

Just as for VARBINARY(MAX) columns that do not have the FILESTREAM attribute, full-text indexing requires a column that stores the extension of the documents stored in the FILESTREAM column. SQL Server looks at this column to determine the document type, loads the appropriate IFilter, and indexes the document.

More about full-text indexing
A detailed explanation of full-text indexing is beyond the scope of this book. For more information, I would suggest you take a look at Books Online at: http://brurl.com/fs13.

For a quick look at how SQL Server indexes FILESTREAM columns using full-text indexing, let's create a table with a few FILESTREAM data values, starting with the sample database shown in Listing 10-9.

USE master
GO
IF DB_ID('NorthPoleFT') IS NOT NULL
    DROP DATABASE NorthPoleFT
GO
CREATE DATABASE NorthPoleFT ON PRIMARY (
    NAME = NorthPoleFT,
    FILENAME = 'C:\ArtOfFS\Demos\Chapter10\NorthPoleFT.mdf'
), FILEGROUP NorthPoleFT_fs CONTAINS FILESTREAM (
    NAME = NorthPoleFT_fs,
    FILENAME = 'C:\ArtOfFS\Demos\Chapter10\NorthPoleFT_fs'
) LOG ON (
    NAME = NorthPoleFT_log,
    FILENAME = 'C:\ArtOfFS\Demos\Chapter10\NorthPoleFT_log.ldf'
)
GO

Listing 10-9: Creating a sample database for full-text indexing.

We'll now create a table to store some office documents, which we'll use for testing the full-text indexing (Listing 10-10).

USE NorthPoleFT
GO
CREATE TABLE [dbo].[Documents]
(
    [DocID] [uniqueidentifier] ROWGUIDCOL NOT NULL ,
    [DocName] [varchar](128) NULL ,
    [DocType] [varchar](10) NULL ,
    [DocBody] [varbinary](MAX) FILESTREAM NULL ,
    CONSTRAINT [PK_Documents] PRIMARY KEY CLUSTERED ( [DocID] ASC )
)
ON [PRIMARY] FILESTREAM_ON [NorthPoleFT_fs]

Listing 10-10: Creating a table with FILESTREAM columns.

And finally, let's insert two rows into this table: an Excel file and a Word document (Listing 10-11). If you are following the example, ensure you create these files, or change the paths to point to files that exist on your computer.

-- Insert the Excel file into the table
INSERT INTO Documents
    ( [DocID] ,
      [DocName] ,
      [DocType] ,
      [DocBody]
    )
SELECT NEWID() ,
       'FILESTREAM Book - List of Chapters' ,
       '.xls' ,
       bulkcolumn
FROM OPENROWSET(BULK 'c:\temp\FILESTREAM Book - List of Chapters.xls',
                SINGLE_BLOB) AS x

-- Insert the Word document into the table
INSERT INTO Documents
    ( [DocID] ,
      [DocName] ,
      [DocType] ,
      [DocBody]
    )
SELECT NEWID() ,
       'Chapter 10 – Integrating FILESTREAM' ,
       '.doc' ,
       bulkcolumn
FROM OPENROWSET(BULK 'c:\temp\Chapter 10 - Integrating FILESTREAM.doc',
                SINGLE_BLOB) AS x

Listing 10-11: Loading documents into the FILESTREAM column of a table.

It is now time to create a full-text catalog. This can be done either using the SSMS wizard or using T-SQL (Listing 10-12).

CREATE FULLTEXT CATALOG DocumentCatalog
AS DEFAULT ;

Listing 10-12: Creating a full-text catalog.
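Because full-text indexing depends on an IFilter being available for each document type, it can be worth confirming that the instance has filters registered for the extensions stored in the DocType column. The sketch below queries the sys.fulltext_document_types catalog view for the two extensions used in Listing 10-11; which filters are actually installed depends on the server.

```sql
-- Sketch: check that filters are registered for the extensions we loaded.
-- The extension list matches the sample rows from Listing 10-11.
SELECT document_type,
       path
FROM sys.fulltext_document_types
WHERE document_type IN ( '.doc', '.xls' );
```

If an extension is missing from the result, documents of that type are unlikely to be indexed until a suitable filter is installed and loaded.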

We can now create a full-text index on the DocBody column, in which we have stored the content of the Office documents (Listing 10-13).

CREATE FULLTEXT INDEX ON dbo.Documents
(
    DocName,
    DocBody TYPE COLUMN DocType
)
KEY INDEX PK_Documents
ON DocumentCatalog
WITH CHANGE_TRACKING AUTO;

Listing 10-13: Creating a full-text index.

Note that we have specified the column that stores the document extension using the TYPE COLUMN clause. This allows SQL Server to understand the document type and use the appropriate IFilter to index it.

That is all we need to do, so Listing 10-14 shows how to search for the keyword Replication within the indexed documents.

USE NorthPoleFT
GO
SELECT *
FROM documents
WHERE CONTAINS ( DocBody, 'Replication' )

Listing 10-14: Searching a full-text index.

When creating a full-text index (using either T-SQL or the SSMS wizard), we can specify the filegroup on which the index should be created. We cannot select a FILESTREAM filegroup as the target filegroup for the full-text index. If a filegroup is not specified, the default data filegroup of the database is used.

When using the SSMS wizard, don't select the FILESTREAM filegroup, otherwise the full-text index creation will fail.
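In T-SQL, the target filegroup is named in the ON clause of CREATE FULLTEXT INDEX. The sketch below is an alternative form of Listing 10-13 that places the index explicitly on the PRIMARY filegroup; because a table can have only one full-text index, it would be run instead of Listing 10-13, not in addition to it. The choice of PRIMARY is an assumption for illustration.

```sql
-- Sketch: same index as Listing 10-13, with an explicit (non-FILESTREAM) filegroup.
CREATE FULLTEXT INDEX ON dbo.Documents
(
    DocName,
    DocBody TYPE COLUMN DocType
)
KEY INDEX PK_Documents
ON ( DocumentCatalog, FILEGROUP [PRIMARY] )
WITH CHANGE_TRACKING AUTO;

-- Optional: check whether the catalog is still being populated
-- (a value of 0 means the catalog is idle).
SELECT FULLTEXTCATALOGPROPERTY('DocumentCatalog', 'PopulateStatus') AS PopulateStatus;
```

Naming NorthPoleFT_fs in the FILEGROUP clause instead would be expected to fail, as described above.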

FILESTREAM and Database Snapshots

Database snapshot is a very interesting feature, introduced in SQL Server 2005, providing a read-only static view of a database that exists in the same SQL Server instance. A database can have multiple snapshots pointing to it, but all must reside on the same SQL Server instance as the source database. Each database snapshot is transactionally consistent with the source database as of the time at which the snapshot is created.

SQL Server allows the creation of a database snapshot of a FILESTREAM-enabled database. However, the snapshot cannot include the FILESTREAM filegroups. If we attempt to include a FILESTREAM filegroup in the database snapshot, the operation will fail with an error, as shown in Listing 10-15.

CREATE DATABASE NorthPoleSnapshot ON PRIMARY (
    NAME = NorthPole,
    FILENAME = 'C:\ArtOfFS\Demos\Chapter10\NorthPoleSnapshot.mdf'
), FILEGROUP NorthPoleSnapshot_fs CONTAINS FILESTREAM (
    NAME = NorthPoleSnapshot_fs,
    FILENAME = 'C:\ArtOfFS\Demos\Chapter10\NorthPoleSnapshot_fs'
) AS SNAPSHOT OF NorthPole
/*
Msg 1815, Level 16, State 5, Line 1
The FILESTREAM property cannot be used with database snapshot files.
*/

Listing 10-15: Snapshot of a FILESTREAM-enabled database cannot include FILESTREAM filegroups.

The T-SQL code in Listing 10-16 creates a snapshot of a FILESTREAM database without the FILESTREAM filegroups.

CREATE DATABASE NorthPoleSnapshot ON PRIMARY (
    NAME = NorthPole,
    FILENAME = 'C:\ArtOfFS\Demos\Chapter10\NorthPoleSnapshot.mdf'
) AS SNAPSHOT OF NorthPole

/*
Command(s) completed successfully.
*/

Listing 10-16: Creating a database snapshot from a FILESTREAM-enabled database.

Be careful when querying the database snapshot of a FILESTREAM database. If a query tries to read from the FILESTREAM columns, it will fail, because the FILESTREAM filegroup is not available with the database snapshot, as shown in Listing 10-17.

USE NorthPole20100817
GO
SELECT * FROM Items
/*
Msg 601, Level 12, State 3, Line 1
Could not continue scan with NOLOCK due to data movement.
*/

Listing 10-17: Reading from FILESTREAM columns is not allowed when using a database snapshot.

With the limitation that the FILESTREAM columns are not available, FILESTREAM database snapshots behave like the snapshots of regular databases. Listing 10-18 reads values from non-FILESTREAM columns in the database snapshot.

USE NorthPole20100817
GO
SELECT ItemNumber, ItemDescription
FROM Items
/*
ItemNumber           ItemDescription
-------------------- -------------------------
MS1002               Microsoft Keyboard
*/

Listing 10-18: Reading non-FILESTREAM columns from a database snapshot.
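To find out which columns of a table can safely be selected from a snapshot, one option is to ask the catalog which columns do not carry the FILESTREAM attribute. The following sketch uses the is_filestream flag in sys.columns and assumes the snapshot and table names from Listing 10-18.

```sql
-- Sketch: list the columns of dbo.Items that are NOT FILESTREAM columns.
-- Assumption: snapshot database and table names as used in Listing 10-18.
USE NorthPole20100817
GO
SELECT c.name AS ColumnName
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.Items')
  AND c.is_filestream = 0
ORDER BY c.column_id;
```

The resulting column list can then be used in place of SELECT * when querying the snapshot.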

In most cases, it will be tedious to selectively exclude the FILESTREAM columns while querying a database snapshot. It would have been much easier if SQL Server simply returned NULL for the FILESTREAM columns instead of raising an error, but it is possible that this was not done because one of the main features of the FILESTREAM storage is the transactional consistency between the relational data and FILESTREAM data.

FILESTREAM and Change Data Capture

Change Data Capture (CDC) was introduced with SQL Server 2008. When this feature is enabled, SQL Server keeps track of the modifications (inserts, updates, and deletions) on the tables being tracked. The data capture takes place asynchronously (by reading the transaction log), which eliminates the additional overhead you may have with a trigger-based custom data capture framework. CDC can be configured to capture data modifications at the table level as well as at the column level.

More on CDC
For a more detailed discussion of CDC, see the Microsoft article, Tuning the Performance of Change Data Capture in SQL Server 2008 at: http://brurl.com/fs25.

Books Online states that CDC does not support any of the new data types introduced in SQL Server 2008. If you decide to enable CDC on your FILESTREAM-enabled tables, you must exclude the FILESTREAM columns from the capture list.

So what happens if you enable CDC on a table and include its FILESTREAM columns anyway? CDC will capture those columns as VARBINARY(MAX) columns.
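Excluding the FILESTREAM columns from the capture list is done with the @captured_column_list parameter of sys.sp_cdc_enable_table. The sketch below assumes a table like the dbo.Items table used in this chapter (columns ItemID, ItemNumber, ItemDescription, and the FILESTREAM column ItemImage) and that CDC has already been enabled on the database with sys.sp_cdc_enable_db. It is an alternative to the unsupported experiment that follows, not a step in it.

```sql
-- Sketch: enable CDC on dbo.Items while leaving the FILESTREAM column out.
-- Assumptions: CDC is already enabled on the database; table and column
-- names follow the Items table used elsewhere in this chapter.
USE NorthPole
GO
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name = N'Items',
    @role_name = NULL,                 -- no gating role in this sketch
    @supports_net_changes = 0,
    @captured_column_list = N'ItemID, ItemNumber, ItemDescription';
```

With this column list, changes to ItemImage alone still produce change rows, but the image data itself is not copied into the change table.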

A warning on enabling CDC on FILESTREAM columns!
It may affect the performance of the system. In addition, it is an unsupported operation and you might end up with a number of problems; for example, large-value type columns may generate a huge volume of change data if those columns are frequently inserted, updated, or deleted.

However, it will be interesting to see what happens when CDC is enabled on a FILESTREAM column. To start with, let us create a FILESTREAM-enabled database (Listing 10-19).

USE master
GO
IF DB_ID('NorthPole') IS NOT NULL
    DROP DATABASE NorthPole
GO
CREATE DATABASE NorthPole ON PRIMARY (
    NAME = NorthPole,
    FILENAME = 'C:\ArtOfFS\Demos\Chapter10\NorthPole.mdf'
), FILEGROUP NorthPole_fs CONTAINS FILESTREAM (
    NAME = NorthPole_fs,
    FILENAME = 'C:\ArtOfFS\Demos\Chapter10\NorthPole_fs'
) LOG ON (
    NAME = NorthPole_log,
    FILENAME = 'C:\ArtOfFS\Demos\Chapter10\NorthPole_log.ldf'
)
GO
USE NorthPole
GO
CREATE TABLE [dbo].[Items]
(
    [ItemID] UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE ,
    [ItemNumber] VARCHAR(20) ,
    [ItemDescription] VARCHAR(50) ,
    [ItemImage] VARBINARY(MAX) FILESTREAM NULL
)

GO
INSERT INTO Items
    ( ItemID ,
      ItemNumber ,
      ItemDescription ,
      ItemImage
    )
SELECT NEWID() ,
       'MS1001' ,
       'Microsoft Mouse' ,
       bulkcolumn
FROM OPENROWSET(BULK 'C:\temp\MicrosoftMouse.jpg',
                SINGLE_BLOB) AS x

Listing 10-19: Creating a FILESTREAM-enabled database to test CDC.

The next step is to enable CDC by running the CDC system stored procedures (Listing 10-20).

USE NorthPole
GO
-- Enable CDC on the NorthPole database
EXEC sys.sp_cdc_enable_db
-- Enable CDC on the Items table
EXEC sys.sp_cdc_enable_table
    @source_schema = 'dbo',
    @source_name = 'items',
    @role_name = 'cdc_items',
    @supports_net_changes = 0

Listing 10-20: Enabling CDC on a table.

Note that you need to have the SQL Server Agent service running to have the CDC data captured. If SQL Agent is not running, you will not see the data until you run SQL Agent and the capture jobs populate CDC tables with the data. To verify if the SQL Server Agent service is running, open Control Panel | Administrative Tools | Services. Scroll down the list of services until you see the SQL Server Agent service for your instance (the instance

name is between parentheses). Check the value in the Status column. If it is blank, then the service is stopped. Right-click the service name and select Start. If you want the service to start automatically when Windows starts, modify the Properties of the service and change the Startup type to Automatic.

Once CDC is enabled, we can make some data modifications and see the results in the captured tables (Listing 10-21).

-- Update the content of the ItemImage column
UPDATE Items
SET ItemImage = CAST(1 AS VARBINARY(MAX))
-- Delete the record
DELETE FROM Items

Listing 10-21: Updating the FILESTREAM value to generate some CDC log entries.

Now, let's go to the CDC tables and see whether the data has been captured (Listing 10-22).

DECLARE @LSNFrom BINARY(10) = sys.fn_cdc_map_time_to_lsn('smallest greater than or equal',
                                                         GETDATE() - 1) ,
        @LSNTo BINARY(10) = sys.fn_cdc_map_time_to_lsn('largest less than or equal',
                                                       GETDATE())
SELECT CASE [__$operation]
           WHEN 1 THEN 'Delete'
           WHEN 2 THEN 'Insert'
           WHEN 3 THEN 'Update (Old Values)'
           WHEN 4 THEN 'Update (New Values)'
       END AS Operation ,
       ItemNumber ,
       ItemDescription ,
       DATALENGTH(ItemImage) AS ImageSize
FROM cdc.fn_cdc_get_all_changes_dbo_Items(@LSNFrom, @LSNTo,
                                          'all update old') ;

/*
Operation           ItemNumber ItemDescription ImageSize
------------------- ---------- --------------- ---------
Update (Old Values) MS1001     Microsoft Mouse 499988
Update (New Values) MS1001     Microsoft Mouse 4
Delete              MS1001     Microsoft Mouse 4
*/

Listing 10-22: Querying the CDC logs for data modification history.

Note that the maximum replication text size configuration setting also affects CDC. We examined this setting in detail when we discussed replication, earlier in this chapter. When CDC is enabled on a table, if you try to insert or update a data value larger than the size specified, the INSERT or UPDATE operation will fail.

To summarize, it is possible to enable CDC on FILESTREAM columns, but CDC will capture the column as VARBINARY(MAX), not as FILESTREAM. Enabling CDC on FILESTREAM columns is not recommended or supported. Before moving on, let's disable CDC on the NorthPole database. Listing 10-23 shows how. It's not necessary to disable CDC on individual tables first.

USE NorthPole
GO
-- Disable CDC on the NorthPole database
EXEC sys.sp_cdc_disable_db

Listing 10-23: Disabling CDC on a database.
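To confirm that CDC really is off after running Listing 10-23, the is_cdc_enabled flag in sys.databases can be checked. This is a small sketch that assumes the NorthPole database from Listing 10-19.

```sql
-- Sketch: verify the CDC state of the NorthPole database.
-- is_cdc_enabled = 0 means CDC is disabled for the database.
SELECT name,
       is_cdc_enabled
FROM sys.databases
WHERE name = N'NorthPole';
```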

FILESTREAM and Data Compression

SQL Server 2008 introduced page-level and row-level compression of relational data, but it does not compress FILESTREAM data stored in the FILESTREAM filegroup(s) of the database. On the other hand, SQL Server will allow you to store the FILESTREAM data container on a compressed disk volume. However, as discussed in the Data and Backup Compression for FILESTREAM Databases section in Chapter 8, documentation from the Microsoft support engineers indicates that this is not a recommended configuration.

Another compression feature introduced in SQL Server 2008 is backup compression. This feature allows you to take a compressed backup by specifying the WITH COMPRESSION option in your backup script. Fortunately, SQL Server does compress the FILESTREAM data when you take a compressed backup of the database, which can save a significant amount of backup storage space and shorten the backup time.

FILESTREAM and Transparent Data Encryption

One of the very interesting data security features added in SQL Server 2008 is Transparent Data Encryption (TDE), enabling encryption of an entire database. The encryption is transparent to application code; SQL Server takes care of encryption and decryption at the storage level and you do not need to do anything other than enable this feature on the database.

However, note that TDE does not encrypt the FILESTREAM data stored in a FILESTREAM database.
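Whether TDE is turned on for a database can be checked with the is_encrypted flag in sys.databases, as in the sketch below (the database name is assumed from this chapter's examples). Keep in mind that the flag describes the database's data and log files only; based on the limitation just described, it says nothing about the files in the FILESTREAM data container.

```sql
-- Sketch: check whether TDE is enabled for the NorthPole database.
-- is_encrypted = 1 means TDE is on for the data and log files;
-- FILESTREAM files in the data container remain unencrypted either way.
SELECT name,
       is_encrypted
FROM sys.databases
WHERE name = N'NorthPole';
```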

FILESTREAM and Database Mirroring

SQL Server does not allow mirroring of a FILESTREAM-enabled database. We cannot add a FILESTREAM filegroup to a SQL Server database that is already mirrored. If a database already has a FILESTREAM filegroup, SQL Server will not allow it to be mirrored.

What is very interesting is that the Mirroring option is still available in the property page of a FILESTREAM-enabled database. The Mirroring page will guide us step by step through configuring the principal, mirror, and witness servers, and we might even get a message saying that mirroring has been successfully configured. However, when we actually start mirroring, an error is raised, as shown in Figure 10-5.

Figure 10-5: Error when attempting to mirror a FILESTREAM-enabled database.

If we attempt the same using T-SQL, SQL Server will generate the following error when the SET PARTNER command is issued:

A database cannot be enabled for both FILESTREAM storage and Database Mirroring.

One of the reasons for not allowing mirroring of a FILESTREAM-enabled database could be the excessive amount of FILESTREAM log data that might need to be transferred to the mirror server if a lot of FILESTREAM activity takes place on the database. Since SQL Server does not allow in-place update of FILESTREAM data values, every modification generates a new file, which has to be transferred to the mirror server.

FILESTREAM and High Availability Disaster Recovery

In SQL Server 2012, the new High Availability Disaster Recovery – Always On (HADRON) feature does support creating highly available database nodes, even with FILESTREAM-enabled databases. HADRON was designed to meet the high availability needs of a wide variety of SQL Server users, and so allows other features that were being used for high availability and disaster recovery (including replication, log shipping, and database mirroring, discussed earlier in this chapter) to focus on the problems they were designed to solve.

Summary

This chapter focused on examining how FILESTREAM works with other popular SQL Server features. Some of these features work well with FILESTREAM.

• FILESTREAM works very well with replication; FILESTREAM columns can be replicated with or without the FILESTREAM attribute, and they can be replicated as VARBINARY(MAX) columns to SQL Server 2005 subscribers.

• SQL Server supports log shipping of FILESTREAM-enabled databases. There is no change in the configuration process except that you need to make sure that the target SQL Server instances have FILESTREAM enabled.

• Full-text indexing works with FILESTREAM columns in a similar way to that of VARBINARY(MAX) columns, but with the 2 GB limitation of data size removed.

However, some of them are only partially supported and a few of the very interesting features introduced in SQL Server 2005 and SQL Server 2008 do not support FILESTREAM at all.

• Database snapshots of FILESTREAM databases can be created. However, you cannot include FILESTREAM filegroups in the database snapshot. When querying the database snapshot, if you attempt to read from the FILESTREAM columns, SQL Server will generate an error.

• Change Data Capture does not "officially" support FILESTREAM columns. If you configure CDC on a FILESTREAM column, the data will be captured as VARBINARY(MAX) columns. Enabling CDC on FILESTREAM columns is not recommended.

• SQL Server does not allow mirroring FILESTREAM databases.

• Compressing data using row-level or page-level compression does not compress FILESTREAM data.

• TDE does not encrypt FILESTREAM data.

However, Microsoft's commitment to providing first-class support for FILESTREAM does show in the new SQL Server 2012 High Availability Disaster Recovery – Always On feature. It can make FILESTREAM data highly available by copying it to secondary servers.

Chapter 11: FileTable

The addition of the FileTable feature in SQL Server 2012 heralded the first significant change to FILESTREAM since its introduction in SQL Server 2008. A FileTable is basically just a "normal" SQL Server table for storing FILESTREAM data, via a VARBINARY(MAX) column called file_stream, but which also stores metadata regarding the underlying files and directories. This allows Windows applications to access FILESTREAM data exactly as if it were stored directly on the file system, and addresses the need to access the FILESTREAM data outside of the context of a SQL Server transaction. It also supports features such as full-text and semantic search on FILESTREAM data.

In this chapter, we'll investigate how a FileTable is different from tables with FILESTREAM columns and how to create and access FileTables. We'll also look behind the covers of FileTable to understand better how the FILESTREAM feature is used to implement FileTables.

Introduction to FileTable

FileTable is solidly based on many concepts found in the FILESTREAM feature. Just like normal FILESTREAM tables, FileTables are meant to store BLOB data on an NTFS volume, while keeping that data integrated with the relational data found in the SQL Server database. FileTable also retains full compatibility with existing SQL Server tools, services, and query options. However, unlike data stored in a traditional FILESTREAM column, client applications do not require modifications in order to access data stored in a FileTable. Contrast this to the requirement of using the Win32 or .NET streaming APIs in client applications that use normal FILESTREAM tables, and you'll agree that FileTables have an advantage.

FileTables can be accessed from any Windows application, including Windows Explorer. This means that users can navigate to a file share that exists on the network and use that file share like they would a traditional file share hosted on a file server. Recall that the file share created for FILESTREAM purposes is visible when browsing to the SQL Server host on the network, but it's not accessible.

Another significant difference between FILESTREAM and FileTable is that FileTables support storing data in a hierarchical fashion using a folder structure. Data in a FILESTREAM-enabled table is flat unless you include some form of hierarchy in the table's design.

Beyond these differences, FILESTREAM data stored in FileTables has the same benefits and restrictions as any other FILESTREAM data, including integrated backup and recovery.

FileTable Concepts

Before we create a FileTable, let's review some of the concepts underlying the FileTable feature, in order to achieve a good understanding of the FileTable architecture, including its limitations.

Fixed schema

FileTables use a predefined and fixed schema, which defines not only the characteristics of the file data itself, but also metadata describing the underlying file or folder. Table 11-1 lists every field in the FileTable schema with a brief explanation of each one. For selected columns, sample values are also provided.

Column Name | Column Description | Sample Value(s)
------------|--------------------|----------------
stream_id | This is the rowguidcol that is required of all tables with FILESTREAM columns. |
file_stream | The FILESTREAM data column that stores the actual file contents. | 0x (empty file); NULL (folder)
Name | The file or folder name. |
path_locator (PK) | The hierarchy identifier of the file or folder. See below for more information about the hierarchy structure. This is the primary key of the table. |
parent_path_locator | This is a persisted computed column that contains the path_locator of the parent folder. The computation uses the GetAncestor() function on the path_locator field. This must be a valid path_locator value and this requirement is enforced using a foreign key constraint. |

File and Folder Metadata Columns

Column Name | Column Description | Sample Value(s)
------------|--------------------|----------------
file_type | This is a computed column that represents the type of file based on the file extension. The computation uses the undocumented GETFILEEXTENSION() T-SQL function to determine the extension of the file. This column is useful when configuring full-text indexes, as discussed later in this chapter. |

Column Name | Column Description | Sample Value(s)
------------|--------------------|----------------
cached_file_size | The size of the file in bytes. This is a computed column. It uses the DATALENGTH() T-SQL function to calculate the file size. The Books Online documentation notes that, in unusual circumstances, the value of this column can get out of sync. The T-SQL DATALENGTH() function will always return the correct file size in bytes. | NULL for folders; ≥ 0 for files
creation_time | The date and time the file was created, which can be set by Windows APIs. Like the next two "_time" fields below, it is stored as a datetime2(4) value, which means the time zone offset is included in the value. |
last_write_time | The date and time the file contents were last updated. Like creation_time and last_access_time, this value can also be set by Windows APIs and thus may not reflect the actual time. |
last_access_time | The date and time the file contents were last accessed. Just as with this metadata attribute on the NTFS file system, it is important to note that it is the accessing application that is responsible for updating this value. Many applications do not change this value when opening a file. |
is_directory | Indicates if the row refers to a folder instead of a file. | 0 for files; 1 for folders

Column Name | Column Description | Sample Value(s)
------------|--------------------|----------------
is_offline | Reflects the NTFS extended file attribute "offline." When this attribute is set, it means that the file is only available when working online. This attribute is best left unmodified. |
is_hidden | Reflects the hidden attribute of the file system. |
is_readonly | Reflects the read-only attribute of the file system. |
is_archive | Reflects the archive attribute of the file system. Usually, the archive bit is reset by full and incremental backup operations and set by the Windows APIs when the file contents are changed. It's important to note that when modifying the file_stream value using T-SQL, the archive bit is not set automatically by SQL Server. |
is_system | Reflects the system file attribute of the file system. |
is_temporary | Reflects the NTFS extended file attribute "temporary." |

Table 11-1: Overview of the FileTable schema.

You'll notice that the majority of the columns in the schema are metadata columns. In a traditional file system, this metadata would be stored in the file system's metadata area (like the MFT for NTFS). This metadata is used by the SQL Server component that exposes the FileTable's contents as a file share to set the file and folder attributes. They do not influence how the files can be used. For example, if the is_readonly bit is set to 1, it does not mean that SQL Server will enforce that the file contents cannot be changed. These metadata attributes are only present to provide compatibility with the NTFS file system.
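Because a FileTable is an ordinary table as far as T-SQL is concerned, the columns in Table 11-1 can be queried directly. The sketch below assumes a FileTable named dbo.DocumentStore, which is a hypothetical name used only for illustration, and compares cached_file_size with the size computed by DATALENGTH().

```sql
-- Sketch: inspect FileTable metadata columns.
-- dbo.DocumentStore is a hypothetical FileTable name, not one defined in this book.
SELECT name,
       file_type,
       is_directory,
       is_readonly,
       cached_file_size,
       DATALENGTH(file_stream) AS actual_size,   -- NULL for folders
       creation_time,
       last_write_time
FROM dbo.DocumentStore
ORDER BY is_directory DESC, name;
```

For files, cached_file_size and actual_size would normally match; a difference would be an instance of the out-of-sync situation mentioned in the table.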

In addition to the predefined, fixed schema of FileTables, several additional database objects are created when creating FileTables. These objects include DEFAULT constraints to set appropriate values for many of the metadata columns in the FileTable, CHECK constraints to ensure that no invalid column values can be added (such as a non-NULL file_stream column if the row refers to a folder and not a file) and UNIQUE constraints to ensure that each full path is unique. There is even a foreign key for the parent_path_locator column to ensure that it refers to a valid folder.

Creating folder structures using the HIERARCHYID data type

FileTable supports folder structures with a maximum depth of 15 subfolders. Even though a maximum of 15 subfolders is supported, you will probably create a maximum of 14 subfolders, because it's not possible to create a file in the 15th level subfolder (as it would create a 16th level item).

The folder structure is implemented using the HIERARCHYID feature introduced with SQL Server 2008. While a complete discussion of the HIERARCHYID data type is beyond the scope of this book, it is valuable to review some of the concepts and basic queries that can be performed using HIERARCHYID fields.

In a FileTable, the primary key is a field of type HIERARCHYID and even though it's easy to query the parent of a given HIERARCHYID value, the parent in the hierarchy is stored in the parent_path_locator field. To keep this discussion focused on the basics of HIERARCHYID, and since we have yet to discuss how to create FileTables, we'll use a sample table that does not actually contain any FILESTREAM data, but just the hierarchy structure and a name field, as shown in Listing 11-1.

    USE NorthPole
    GO
    CREATE TABLE Hierarchy
    (
        path_locator HIERARCHYID NOT NULL
            PRIMARY KEY CLUSTERED ,
        parent_path_locator AS ( CASE WHEN path_locator.GetLevel() = 1
                                      THEN NULL
                                      ELSE path_locator.GetAncestor(1)
                                 END )
            PERSISTED
            FOREIGN KEY REFERENCES Hierarchy ( path_locator ) ,
        name NVARCHAR(255) NOT NULL
    )

Listing 11-1: Create a sample hierarchy in SQL Server.

The script creates a new table containing three fields with names also found in FileTables. The path_locator is the primary key, and the parent_path_locator is a computed column. In addition, the parent_path_locator is also a foreign key which references the primary key of the same table. This is to ensure we cannot add a child for a non-existing parent. The name is the would-be name of the file or folder stored in the FileTable.

Listing 11-2 demonstrates how a new row can be added to the Hierarchy table. It is up to the application to create the HIERARCHYID values. A HIERARCHYID can be written as a string representation, whereby each level in the hierarchy is separated using '/'. A HIERARCHYID value always starts and ends with a forward slash.

    INSERT INTO Hierarchy
        ( path_locator, name )
    VALUES ( '/1/', '1.txt' )

Listing 11-2: Adding a row to the hierarchy.

When querying the table contents (Listing 11-3), you find that a HIERARCHYID is a binary type. Using a simple CAST, we can get back the string representation of the HIERARCHYID.

    SELECT path_locator ,
           CAST(path_locator AS NVARCHAR) 'path' ,
           parent_path_locator ,
           name
    FROM Hierarchy
    /*
    path_locator path parent_path_locator name
    ------------ ---- ------------------- -----
    0x58         /1/  NULL                1.txt
    */

Listing 11-3: Querying a HIERARCHYID.

The value of parent_path_locator is NULL because this row was added at the first level. Listing 11-4 shows how to add a row at the second level. The updated output of the SELECT statement from Listing 11-3 is also shown.

    INSERT INTO Hierarchy
        ( path_locator, name )
    VALUES ( '/1/1/', '1.1.txt' )
    /*
    path_locator path  parent_path_locator name
    ------------ ----- ------------------- -------
    0x58         /1/   NULL                1.txt
    0x5AC0       /1/1/ 0x58                1.1.txt
    */

Listing 11-4: Adding a row to a second level.

Assuming the name 1.txt refers to a file, the preceding code would be nonsensical in a FileTable scenario, because file system files cannot have children. Also, in FileTables, you would not determine the path_locator yourself; rather, it is the result of the position of the file or folder in the file system's structure.
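The sample Hierarchy table can also be used to try a few other HIERARCHYID methods. The following is a minimal sketch, assuming the two rows from Listings 11-2 and 11-4 are present; the column aliases and the @parent variable are illustrative only. Note that IsDescendantOf() treats a node as a descendant of itself, so the second query returns the /1/ row as well as the /1/1/ row.

```sql
USE NorthPole
GO
-- Show the level of each node and the string form of its parent.
-- For a first-level node, GetAncestor(1) is the hierarchy root, '/'.
SELECT  path_locator.ToString() AS [path] ,
        path_locator.GetLevel() AS [level] ,
        path_locator.GetAncestor(1).ToString() AS parent_path ,
        name
FROM    Hierarchy
ORDER BY path_locator

-- Return the node /1/ and everything stored beneath it
DECLARE @parent HIERARCHYID = hierarchyid::Parse('/1/')
SELECT  path_locator.ToString() AS [path] ,
        name
FROM    Hierarchy
WHERE   path_locator.IsDescendantOf(@parent) = 1
```

The HIERARCHYID method names are case sensitive, so GetLevel(), GetAncestor(), ToString() and IsDescendantOf() need to be typed exactly as shown.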

More on using hierarchies
Books Online has a tutorial available at http://brurl.com/fs14.

Root folder

A single SQL Server instance can host multiple FileTables, across multiple databases, so each database must be assigned a root folder. This root folder becomes a child folder of the FILESTREAM share on the SQL Server host. By default, the name of the file share is MSSQLSERVER (or the SQL Server instance name, if you use named instances).

The root folder must be unique across all databases in the same instance, and can be configured from SSMS or using T-SQL. Using SSMS, simply right-click on the database's node in Object Explorer and click Properties and then, in the Database Properties dialog box, go to the Options page, locate the FILESTREAM Directory Name option and type in the folder name.

Listing 11-5 shows how to set the root folder using a T-SQL script. Note that the folder name can be different from the database name, but it must be a valid Windows folder name.

    -- Configure the root folder
    ALTER DATABASE NorthPole SET FILESTREAM (DIRECTORY_NAME = 'NorthPole')

Listing 11-5: Set the FileTable root folder name using T-SQL.

This method can also be used to change the name of the root folder, if previously set. This change takes effect immediately, although it requires that all active connections to the database are closed. You cannot reset the FILESTREAM directory name to NULL as long as there are FileTables in the database.
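Because the root folder name has to be unique across the instance, it can help to see which names are already taken before choosing one. A minimal sketch, using the sys.database_filestream_options catalog view (the column aliases are illustrative):

```sql
-- List the FILESTREAM directory names already in use on this instance
SELECT  DB_NAME(database_id) AS database_name ,
        directory_name ,
        non_transacted_access_desc
FROM    sys.database_filestream_options
WHERE   directory_name IS NOT NULL
ORDER BY directory_name
```

Databases that have never been configured for FileTable have a NULL directory_name and are filtered out by the WHERE clause.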

Non-transactional access

Transactional access to the BLOB data stored in a FileTable is supported like it is with any FILESTREAM column. The ability to access the files and folders in the FileTable using traditional Windows API-based applications requires that access is also allowed outside the bounds of a SQL Server transaction. Access to traditional FILESTREAM columns still requires a transaction context for reading and writing, even in SQL Server 2012, and this is regardless of any database-level configuration for non-transactional access.

This non-transactional access is configured at the database level, meaning that we can have multiple databases on the same SQL Server instance with different configurations for non-transactional access of FileTables. The options for non-transacted access are:

• Off – Non-transactional access to FileTables is not allowed; any access to the FILESTREAM data stored in all FileTables in the database follows the same rules as regular FILESTREAM.
• ReadOnly – Non-transactional access to FileTable data is only possible for reading; client applications cannot create or modify files and folders unless they use traditional FILESTREAM techniques.
• Full – Non-transactional access to FileTables is allowed for both reading and writing; client applications can open and save files and create folders using the standard Windows or .NET file I/O APIs.

Internally, SQL Server will use the Read Committed transaction isolation level to enforce locking or concurrency semantics.

To configure non-transactional access using SSMS, open the Database Properties window for the database in question, and in the Options page locate the FILESTREAM Non-Transacted Access option and select the appropriate value as shown in Figure 11-1.

Figure 11-1: Setting non-transactional access level using SSMS.

After selecting the appropriate option, close the Database Properties dialog. SSMS will warn you that, in order to change database properties, all connections to that database must be closed. You must click Yes in order to save your changes. If you click No, the Database Properties dialog is not closed, and you can cancel the changes instead.

To configure non-transactional access using a T-SQL script, refer to Listing 11-6.

    -- Configure non-transactional access to FileTables in the current DB
    ALTER DATABASE NorthPole SET FILESTREAM (NON_TRANSACTED_ACCESS = FULL)

Listing 11-6: Configuring non-transactional access using T-SQL.

The valid values for the NON_TRANSACTED_ACCESS option are OFF, READ_ONLY and FULL, as described previously.

While non-transactional access to FileTable data is a requirement for interoperability with client applications that only use Windows file I/O APIs, there is also a risk of consistency problems. For example, it is possible for a client application to "see" a file and read the file attributes (but not read the file contents) while a transaction that is creating the file is still in progress. Should the transaction roll back and the file be removed, the application might be acting on incorrect information. I have not found a way to avoid this possibility, although I did have to work hard to make it happen.

A noteworthy difference between non-transactional access and transactional access is related to garbage files. Garbage files are not created when updating files using non-transactional access. Rather, the original FILESTREAM data file is written to directly.

This allows for in-place (or partial) updates to the files, potentially resulting in significant performance enhancements over other FILESTREAM scenarios. This enhancement comes at the cost of transactional integrity. When a file is deleted using non-transactional access, the FILESTREAM file will remain and follow the same rules for garbage collection as other garbage files.

FileTable namespace

The namespace of a FileTable refers to its file and folder structure. In order to navigate to a FileTable's root folder, you need to know the machine name, the instance-level FILESTREAM share name, the database-level folder name and the FileTable-level folder name. The root path can then be constructed like this:

    \\servername\instance-share\database-directory\FileTable-directory

SQL Server 2012 provides several functions to obtain the correct names of the share and folders. Assuming a default SQL Server instance with default FILESTREAM share name, a database called NorthPole with default FILESTREAM directory name, and a FileTable called Documents with a default FileTable directory name, the UNC path would be:

    \\servername\MSSQLSERVER\NorthPole\Documents

In order to implement an NTFS-compatible structure in SQL Server tables (referred to as "FileTable semantics" in Books Online), FileTable uses many constraints, computed columns and defaults.

Constraints and computed columns can cause a lot of overhead if many operations are performed on a FileTable in short order. You might be performing more operations than usual if the namespace of the FileTable is being reorganized, such as when you are moving files from one folder to another. Another example might be a bulk load operation. On those occasions, you may consider disabling the constraints to achieve better performance while these tasks are being completed. Disabling the constraints is accomplished by disabling the FileTable's namespace. You will need to re-enable the namespace before the FileTable contents will be available again.

Before the FileTable's namespace will be re-enabled (and the contents be available using Windows APIs), SQL Server will perform a consistency check on the FileTable. If the consistency check fails, then the namespace will not be re-enabled and the inconsistency must be fixed.

To disable a FileTable's namespace in SSMS, right-click on the FileTable, open the Table Properties dialog box, and go to the FileTable page. Change the value of the FileTable namespace enabled option to False.

The code in Listing 11-7 disables and then re-enables the FileTable namespace for the Documents table. Barring a rogue DML operation that takes place between the disable and enable operations, this simple action will not result in inconsistencies.

    -- Disable the Documents FileTable namespace
    ALTER TABLE Documents DISABLE FILETABLE_NAMESPACE
    GO
    -- Enable the Documents FileTable namespace
    ALTER TABLE Documents ENABLE FILETABLE_NAMESPACE

Listing 11-7: Disabling and enabling a FileTable's namespace using T-SQL.

Since disabling the FileTable namespace disables constraints, this should not be done without thoroughly considering the potential implications of data consistency problems.
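After a bulk operation, it may be worth confirming that the namespace was actually re-enabled. One way to check, sketched here on the assumption that the sys.filetables catalog view is queried from within the NorthPole database:

```sql
USE NorthPole
GO
-- is_enabled = 1 means the FileTable namespace is enabled and the
-- contents are reachable through the file share again
SELECT  OBJECT_NAME(object_id) AS filetable_name ,
        is_enabled ,
        directory_name
FROM    sys.filetables
```

A value of 0 in is_enabled after running the ENABLE statement would suggest that the consistency check described above did not pass.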

FileTable security

FILESTREAM data can only be accessed in the context of SQL Server, so SQL Server exercises control over who can access the data. For FILESTREAM columns, the same permissions can be set as for other columns, thereby allowing or denying access to roles or users.

SQL Server also maintains control over the FILESTREAM data that is stored in FileTables. As such, an important restriction on the use of FileTables must be noted: SQL Server security can work at the table and column level, but not at the row level. In FileTables, each row is a file or folder and, as such, these cannot be secured individually (unlike with NTFS permissions). It is not possible to create an access control list on a file or folder stored in a FileTable, and Windows Explorer will not show the Security tab in the file or folder properties dialog.

In order for a user to have access to FileTable data using Windows APIs, the user's Windows logon account will have to be granted permissions on all columns in the FileTable (or the user account will need to be a member of a role that has such permissions).

It is possible to prevent users from performing certain tasks in a FileTable by modifying the permissions on the FileTable. This is best accomplished by creating database roles, such as FileTable_nodelete and FileTable_noinsert, to which you can then add logins or other roles. If you have multiple FileTables, you may consider creating a role for each set of FileTables with specific access restrictions.

Prevent creating and deleting files or folders

To prevent users from deleting files and folders, you can explicitly deny the Delete permission on the FileTable(s). The affected logins will be able to access and modify the files and folders in the FileTable as well as create new ones, but will not be able to delete any files or folders, even those they created themselves. If a user attempts to delete a file or folder, it will appear to succeed, but a simple refresh of the folder shows that the item is still there. No error message is shown to the user. This, again, is a departure from the way NTFS permissions are normally set up: the CREATOR OWNER of an object usually has full control over that object, including delete permission.

Denying the Delete permission to users is effective only to prevent accidental deletion. As we'll discover below, it's not possible to make files read-only on a user-by-user basis. That means that a user with malicious intent can still destroy the contents of the file, even if the file itself cannot be deleted.

It is also possible to prevent users from creating new files and folders in a FileTable by denying the Insert permission in a similar fashion as for the Delete permission. When a user attempts to create a new file or folder in the FileTable, they will receive an error message that looks like the one below (Figure 11-2). This error message is unfortunately not very polished.

Figure 11-2: Error message when attempting to create a new file or folder.

No read-only access

Unfortunately, it is not possible to provide read-only access to certain users or roles while providing read and write access to others. When explicitly denying the Update permission on the FileTable to a role or user, users are still able to open files and folders, make changes and save those changes using Windows file I/O APIs. However, because the Update permission is denied, the metadata fields in the FileTable do not get updated. So, the last_write_time and cached_file_size values of that row in the FileTable will not reflect accurate values. At the time of going to press, Microsoft confirmed that this is a bug and that they intend to issue an update to address it.

It is possible that these shortcomings in the security features of FileTable will be a deal breaker for some environments, and you will have to review these and discuss the potential implications with the stakeholders.

Setting Up the Server for FileTable

FileTable is based on FILESTREAM, so the SQL Server machine and instance need to be configured to support FILESTREAM. Specifically, the use of FileTables requires the FILESTREAM access level to be set to allow remote streaming access. Complete details of how to set up and verify FILESTREAM configuration are found in Appendix A.

It is also important to note that, just like FILESTREAM, to access FileTable data using the Windows API, Windows authentication must be enabled for the SQL Server instance.

Creating a Database with Support for FileTable

In order to create a FileTable, the hosting database must be configured to support FILESTREAM and FileTable. A complete discussion about creating databases with FILESTREAM support is provided in Chapter 2.

In addition to support for FILESTREAM, the database must also have the FILESTREAM Directory Name option set to a valid folder name. It is also most likely that you will enable non-transactional access to the FileTables in your database. The T-SQL script in Listing 11-8 creates a database that has support for FileTable enabled. Review the section on FileTable Concepts, above, to learn how to configure the FileTable settings using SSMS.

    -- If the 'NorthPole' database exists, drop it
    IF DB_ID('NorthPole') IS NOT NULL
    BEGIN
        DROP DATABASE NorthPole;
    END
    -- Create NorthPole database
    CREATE DATABASE NorthPole ON
    PRIMARY (
        NAME = NorthPole,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter11\NorthPole.mdf'
    ), FILEGROUP NorthPole_fs CONTAINS FILESTREAM(
        NAME = NorthPole_fs,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter11\NorthPole_fs' )
    LOG ON (
        NAME = NorthPole_log,
        FILENAME = 'C:\ArtOfFS\Demos\Chapter11\NorthPole_log.ldf')
    -- Configure non-transactional access
    ALTER DATABASE NorthPole SET FILESTREAM (NON_TRANSACTED_ACCESS = FULL)
    -- Configure the root folder
    ALTER DATABASE NorthPole SET FILESTREAM (DIRECTORY_NAME = 'NorthPole')

Listing 11-8: T-SQL script to create a database with FileTable enabled and configured.

If you have an existing database, you can verify its readiness to support FileTable by querying the sys.database_filestream_options catalog view as shown in Listing 11-9.

    SELECT non_transacted_access, non_transacted_access_desc,
           directory_name
    FROM sys.database_filestream_options
    WHERE database_id = DB_ID('NorthPole')
    /*
    non_transacted_access non_transacted_access_desc directory_name
    --------------------- -------------------------- --------------
    2                     FULL                       NorthPole
    */

Listing 11-9: Querying a catalog view to verify FileTable support.
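Alongside the database-level check, the instance-level FILESTREAM settings can be inspected with SERVERPROPERTY. This is a minimal sketch; the column aliases are illustrative. Under the documented meaning of these properties, an effective level of 3 corresponds to T-SQL access plus local and remote streaming access, which is the level FileTable needs.

```sql
-- Inspect the instance-level FILESTREAM share name and access level
SELECT  SERVERPROPERTY('FilestreamShareName') AS share_name ,
        SERVERPROPERTY('FilestreamConfiguredLevel') AS configured_level ,
        SERVERPROPERTY('FilestreamEffectiveLevel') AS effective_level
```

If the configured and effective levels differ, a pending configuration change (for example, one waiting on a service restart) is a likely explanation.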

If the directory_name column is NULL, then you will need to change that option, using either SSMS or T-SQL. Both methods are discussed in the earlier FileTable Concepts section.

Creating a FileTable

When we have a database that meets the prerequisites, we can create our first FileTable.

Create a FileTable using T-SQL

A FileTable has a fixed schema, so the T-SQL statement to create one is as simple as shown in Listing 11-10.

    USE NorthPole
    GO
    CREATE TABLE Documents AS FILETABLE

Listing 11-10: T-SQL statement to create a FileTable.

You'll notice there is no need to specify field names, but we specify a new clause: AS FILETABLE. This clause indicates to SQL Server that the standard schema is to be used.

With the FileTable created, use Windows Explorer to navigate to the FileTable's folder, using the path below. This path assumes that you have a default instance of SQL Server, you did not change the FILESTREAM share name and you configured the FILESTREAM directory name to be the same as the database name (NorthPole):

    \\localhost\MSSQLSERVER\NorthPole\Documents
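Rather than assembling this path by hand, the built-in FileTableRootPath() function and the GetFileNamespacePath() method can return it. The following is a sketch that assumes the Documents FileTable from Listing 11-10 was created in the dbo schema; the column aliases are illustrative.

```sql
USE NorthPole
GO
-- UNC path of the database-level root folder
SELECT FileTableRootPath() AS database_root

-- UNC path of the root folder of one specific FileTable
SELECT FileTableRootPath('dbo.Documents') AS filetable_root

-- Full UNC path of every file and folder stored in the FileTable
SELECT  name ,
        file_stream.GetFileNamespacePath(1) AS full_path
FROM    dbo.Documents
```

The host name that appears in the returned paths depends on the server's configuration, so it will generally be the machine name rather than localhost.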

By default, the folder name of the FileTable is the table name. You can change the name of the directory (which is a subfolder of the database's root folder discussed above) by adding the FILETABLE_DIRECTORY option in the CREATE TABLE statement (Listing 11-11).

    CREATE TABLE Documents AS FILETABLE
    WITH (FILETABLE_DIRECTORY = 'NorthPole Documents')

Listing 11-11: Creating a FileTable with a specific directory name.

If you need to change the directory name, you can use the ALTER TABLE statement in Listing 11-12.

    ALTER TABLE Documents SET (FILETABLE_DIRECTORY = 'Documents')

Listing 11-12: Change the directory name of a FileTable.

The directory name of a FileTable must be a valid Windows directory name, which means that certain characters are not allowed. For example, a directory name cannot contain the pipe character (|). While it's probably not common, it is possible in SQL Server to create a table with a pipe character in the name. If you try to create a FileTable with the name Documents|Confidential, you will receive an error message like the one below.

    Msg 33402, Level 16, State 1, Line 1
    An invalid directory name 'Documents|Confidential' was specified. Use a valid Windows directory name.

At that point, you will need to decide if you want to change the table name to match the requirements of Windows naming conventions or if you want to override the default folder name using the FILETABLE_DIRECTORY option.

The FileTable data in a new FileTable will be stored on the default FILESTREAM filegroup defined in your database. You may have a need to place the FileTable data on a specific filegroup that is not the default filegroup in the database. Listing 11-13 shows how this can be done.

    -- Create a new FILESTREAM filegroup
    ALTER DATABASE NorthPole
    ADD FILEGROUP NorthPole_fs2 CONTAINS FILESTREAM
    -- Create a new FILESTREAM data container
    ALTER DATABASE NorthPole
    ADD FILE (
        NAME = 'NorthPole_fs2',
        FILENAME = 'C:\ArtOfFS\Demos\ChapterX\NorthPole_fs2')
    TO FILEGROUP NorthPole_fs2
    USE NorthPole
    GO
    -- Place the FileTable data on a specific FILESTREAM filegroup
    CREATE TABLE Documents AS FILETABLE
    FILESTREAM_ON NorthPole_fs2

Listing 11-13: Creating a FileTable that uses a specific FILESTREAM filegroup.

You will notice the FILESTREAM_ON clause is the same clause you would use when changing the FILESTREAM filegroup on a regular table.

Create a FileTable using SSMS

In other chapters, we have highlighted that it's not possible to create FILESTREAM columns using SSMS. However, SSMS does offer some support for creating FileTables by providing a shortcut menu entry and Create FileTable template code.

To create a FileTable using SSMS, right-click on the Tables node and select the New FileTable item. If you already have a FileTable defined in the database, you can also right-click on the FileTables node. SSMS will create a new query window and include a Create FileTable template. Unfortunately, the default template requires a lot of work to customize, including selecting the correct database to create the FileTable in as well as replacing placeholder values in more than 20 locations.

It is possible to customize the template and reduce the number of placeholders that need to be replaced. On a 64-bit system, the file containing the template can be found at:

    C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Binn\ManagementStudio\SqlWorkbenchProjectItems\Sql\Table\Create FileTable.sql

After creating a precautionary backup, you can edit this file to your liking. For example, you can remove the USE statement, change occurrences of the schema_name placeholder to dbo and comment out the parts that set the optional values for FILETABLE_DIRECTORY and FILETABLE_COLLATE_FILENAME.

Listing 11-14 shows a suggested Create FileTable template after the edits described above are made.

    -- ==========================================
    -- Create FileTable template
    -- (customized 2012-01-06, Art of FILESTREAM)
    -- ==========================================
    IF OBJECT_ID('dbo.', 'U') IS NOT NULL
        DROP TABLE dbo.
    GO
    CREATE TABLE dbo. AS FILETABLE
    -- WITH
    -- (
    -- FILETABLE_DIRECTORY = '',
    -- FILETABLE_COLLATE_FILENAME =
    -- )
    GO

Listing 11-14: Create FileTable template with fewer placeholders.

Work with a FileTable using SSMS

While SSMS has only limited support for creating FileTables, there are some commands and elements that have been added to the GUI that can come in handy while working with FileTables.

SSMS lists FileTables in their own node in the Object Explorer tree. It lists all the FileTables in the database and provides a context menu that contains the New FileTable entry. When drilling down into the FileTables node, it acts just like the Tables node (Figure 11-3).

Figure 11-3: FileTables node in SSMS (trimmed full column list for brevity).

Using SSMS, it is possible to change the FileTable's directory name. This change takes effect immediately, but can only be made if no files are open in that FileTable. To change the directory name, right-click on the FileTable and select Properties, then in the Table Properties dialog, go to the FileTable page and change the value of the FileTable directory name property to a valid Windows directory name (Figure 11-4).

Figure 11-4: Changing the FileTable directory name using SSMS.

To help you quickly open a Windows Explorer window showing the contents of the FileTable, right-click on the FileTable name and select Explore FileTable Directory. This menu option will be grayed out if the database is not set up for non-transactional access or if the FileTable's namespace is disabled.

Adding constraints to the FileTable

It's not possible to create constraints on the FileTable at creation time, but you can add constraints later, using an ALTER TABLE statement. Note that you cannot remove the constraints that are created by default.

By way of example, let's add a constraint to the FileTable that would prevent files from being created at the FileTable root folder. A constraint like this might be useful if you have defined folders in the FileTable and you do not want users to create files unless they're in one of those subfolders. Listing 11-15 shows how to define such a constraint on a FileTable.

    ALTER TABLE Documents
    ADD CONSTRAINT CK__Documents__NoFilesInRoot
    CHECK(is_directory = 1 OR path_locator.GetLevel() > 1)

Listing 11-15: Creating a constraint on a FileTable.

In Listing 11-15, the constraint checks that either the value of the is_directory column is 1 (we'll allow creating folders at any level) or that the level in the hierarchy is greater than 1. This will prevent users from creating files at the FileTable's root, but not at any other level. When attempting to create a new file at the root, the error message shown in Figure 11-5 will appear.

Figure 11-5: Error message when attempting to create a file at Level 1 after applying a constraint.

If you have plans to apply the same constraint to multiple FileTables in the database, it would be a good idea to extract the logical expression into a user-defined function. For now, however, let's remove this constraint from the table (Listing 11-16).

    ALTER TABLE Documents
    DROP CONSTRAINT CK__Documents__NoFilesInRoot

Listing 11-16: Drop a user-defined constraint from a FileTable.

Restrictions on creating FileTables

While most operations that can be performed on regular tables can be performed on FileTables, there are a few restrictions. Foremost, of course, is the notion that the schema is predefined and fixed. You cannot add or remove columns from a FileTable. Usually, it's possible to work around that limitation by defining another table and creating a one-to-one relationship between the FileTable and the table containing the additional data.
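The one-to-one workaround might be sketched as follows. The table name DocumentDetails and its two data columns are hypothetical, and the sketch assumes the Documents FileTable lives in the dbo schema. It keys the companion table on the FileTable's stream_id column, on the assumption that this value stays with the file when it is renamed or moved, whereas path_locator reflects the file's current position in the hierarchy.

```sql
USE NorthPole
GO
-- Hypothetical companion table holding extra data for each file.
-- Using stream_id as both primary key and foreign key allows at most
-- one detail row per row in the FileTable (a one-to-one relationship).
CREATE TABLE dbo.DocumentDetails
(
    stream_id UNIQUEIDENTIFIER NOT NULL
        PRIMARY KEY
        FOREIGN KEY REFERENCES dbo.Documents ( stream_id ) ,
    customer_reference NVARCHAR(50) NULL ,
    reviewed BIT NOT NULL DEFAULT ( 0 )
)
```

One design consequence worth testing before adopting this approach: as written, the foreign key will block the deletion of a FileTable row for as long as a matching detail row exists.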

Furthermore, it's also not possible to create a FileTable in the tempdb database or create a FileTable as a temporary table (which is expected, based on the first restriction, as temporary tables are created in tempdb). That restriction also applies to FILESTREAM in general, as it's not possible to modify tempdb to add a FILESTREAM filegroup.

Accessing a FileTable with Windows Explorer and Other Client Applications

Once you have created a FileTable, it is ready for use. If you intend to use your FileTable to store data used by an application that is not FILESTREAM-aware (i.e. the application will not be modified to support transactional access to the files), you do not need to take any further action.

By way of example, let's create a new file and folder and examine the effect in SQL Server. Using Windows Explorer, navigate to the file share where your FileTable is located. If you followed the example from Listings 11-8 and 11-10, this will be your path name:

    \\localhost\MSSQLSERVER\NorthPole\Documents

Alternatively, you can also right-click on the FileTable in SSMS and choose Explore FileTable Directory from the shortcut menu.

Creating files

At this time, the folder should be empty. Right-click the right pane and select New | Text Document. Provide a filename for the new file, such as Letter.txt and confirm.

Now open (or return to) SSMS and select all rows from the Documents FileTable. You will find that one row is returned. You can verify the value of the name column to make sure the correct file is returned. Also note the value of the cached_file_size column, which should read 0 (as the file is empty). A full discussion of the FileTable schema is included above.

It is possible that you set a value for the FILESTREAM Non-Transacted Access to ReadOnly. In that case, you will not be able to create new files. Windows Explorer will show an error message like the one in Figure 11-6.

Figure 11-6: Windows Explorer error message if FILESTREAM Non-Transacted Access is set to ReadOnly.

The error message states that destination folder access is denied, but there really isn't a permission issue. It may be more constructive to think about it as a configuration issue.

If you successfully created the new text file, open it in your favorite text editor (which is likely not FILESTREAM-aware!) and add some text to the file. Use the Save command in your text editor to save your changes. Rerun the SELECT query in SSMS and review the results. You will find that the cached_file_size column has been updated to reflect the new file size. Close the text editor.
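The SELECT query used in this walkthrough could be as simple as the sketch below. It restricts the output to the columns discussed here and adds DATALENGTH(file_stream) so that the cached size can be compared with the actual length of the stored data; the actual_size alias is illustrative.

```sql
USE NorthPole
GO
-- Compare the cached file size with the actual length of the stored data
SELECT  name ,
        is_directory ,
        cached_file_size ,
        DATALENGTH(file_stream) AS actual_size ,
        last_write_time
FROM    Documents
```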

Creating folders

One of the major benefits of FileTable is the ability to store the data hierarchically, meaning in folders. Let's create a new child folder and examine the effect in the FileTable. In Windows Explorer, right-click in the right pane, select New | Folder, provide a name for the new folder and confirm. Rerun the SELECT statement to retrieve all rows from the FileTable, and note that you now have two results. The second row is your folder.

The best way to distinguish between a folder and a file row in a FileTable is using the is_directory field. The value is 0 for a file and 1 for a folder. You can also check for a NULL in the file_stream column.

Memory-mapped files

One major caveat of FileTable is that memory-mapped files are not supported. A discussion of memory-mapped files is beyond the scope of this text, but suffice it to say that some popular Windows utilities (including Notepad and Paint) use them when opening files on the local system, even if the path is a UNC path (like \\systemname\MSSQLSERVER\NorthPole). While those utilities likely do not make or break a FileTable adoption, let's look at a workaround.

If you created a new, zero-length text file, you will be able to successfully open the file with Notepad, make changes to it, and save it. However, once you close the file and then try to reopen it, you will receive this error message from Notepad:

    The request is not supported.

Notepad will no longer be able to open the file. This is because Notepad uses memory-mapped files. You can work around this issue by mapping a drive letter to the FILESTREAM share. Also, when accessing the files from a host other than the SQL Server machine, memory-mapped files are not used.

Empty database folder

If a user account has only been granted select, insert, update and/or delete permission on the FileTables in a database, the shared folder for that database will appear empty when browsing to it. When typing in the name of the folder associated with a FileTable, you can navigate further.

As an example, consider John, who has been given select permission only on the FileTable in the database NorthPole on the server SQL01. (Assume the FILESTREAM directory name and the FileTable directory name are the same as the database and table names). When John uses Windows Explorer to navigate to the database's folder at \\SQL01\MSSQLSERVER\NorthPole, the folder will appear empty.

By appending the folder name to the path \\SQL01\MSSQLSERVER\NorthPole\DocumentStore, John will have access to the FileTable.

This symptom is caused by the fact that the user was not granted the View definition permission on the FileTable. The commonly used built-in database roles db_datareader and db_datawriter are also not granted those permissions.

So, in order for your users to have a fluent experience while accessing FileTable shares, they should be given the View definition permission.

Programming FileTable

In this section, we'll cover how to read and update FileTable data using application code. We'll use both T-SQL and the .NET file I/O APIs to read, create, modify, and delete files and folders in a FileTable.
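Before moving on to the programming examples, here is a sketch of how the View definition permission from the "Empty database folder" discussion might be granted. The role name FileTableUsers and the Windows account SQL01\John are hypothetical; the account is assumed to already exist as a user in the NorthPole database, and the Documents FileTable is assumed to be in the dbo schema.

```sql
USE NorthPole
GO
-- Hypothetical role for users who work with the FileTable through the share
CREATE ROLE FileTableUsers

-- Data access to the FileTable
GRANT SELECT, INSERT, UPDATE, DELETE ON dbo.Documents TO FileTableUsers

-- Lets the FileTable folder show up when browsing the database's folder
GRANT VIEW DEFINITION ON dbo.Documents TO FileTableUsers

-- Add the (hypothetical) database user for John's Windows account
ALTER ROLE FileTableUsers ADD MEMBER [SQL01\John]
```

Granting the permissions to a role instead of to individual users keeps the approach in line with the role-based restrictions suggested in the FileTable security section.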

Adding rows using T-SQL

Although it will probably be uncommon to add rows to a FileTable using T-SQL, it is possible to do so. Applicable scenarios may include hydrating default rows to a new database instance or, perhaps, the creation of new folders using your application's custom GUI.

In the next few examples, we will prepare our Documents FileTable for use by an order entry application. We will create some files and folders in the table by using the T-SQL INSERT statement and specifying values for the required fields that don't have default values. The file we will create in the root of the FileTable is a README file, which one might use to inform end-users about the proper use of the file share.

Listing 11-17 shows the minimum required T-SQL code to create a new file in our Documents FileTable. Successfully executing this statement requires that you dropped the user-defined constraint using the code in Listing 11-16.

    INSERT INTO Documents ([name], file_stream)
    VALUES ('ReadMe.txt', 0x)

Listing 11-17: Creating an empty file in the root folder using T-SQL.

There is no default value defined for the file_stream column, and a NULL is not allowed unless the row refers to a folder, so we specify 0x to create a zero-length file.

Listing 11-18 shows how to create a new folder using T-SQL. The folder will be used to store documents about each order that is placed. In a real-world scenario, this might include an image of the shipping label. In our scenario, we will use only dummy file contents.

    INSERT INTO Documents ([name], is_directory)
    VALUES ('Order Files', 1)

Listing 11-18: Creating a new folder in the FileTable root folder using T-SQL.

We may also need to create a new file in a subfolder. We must ensure that we create this new file in a valid hierarchy, which means that we must include the folder's path_locator in the path_locator of this new record. As an example, Listing 11-19 shows a simplified view of the structure we just created using Listing 11-17 and Listing 11-18.

    SELECT CAST(path_locator AS VARCHAR(4000)), name,
           path_locator.GetLevel() AS 'level', is_directory
    FROM Documents
    /*
    path_locator name        level is_directory
    ------------ ----------- ----- ------------
    /1.1.1/      ReadMe.txt  1     0
    /2.2.2/      Order Files 1     1
    */

Listing 11-19: Querying the contents of the Documents table.

While the path_locator values you will find are more complex, the point is to illustrate that the file and the folder have different hierarchyid (path_locator) values, but that both are at level 1 in the hierarchy. The new file we want to add should have a path_locator and level value like the ones shown in Table 11-2.

    path_locator   name         level
    -------------  -----------  -----
    /2.2.2/3.3.3/  NewFile.txt  2

Table 11-2: path_locator value for the new file.

We must create the new file record as a child of the Order Files folder, so we cannot use the DEFAULT constraint on the path_locator column to let SQL Server create a path_locator value located in the FileTable root. Instead, we must first find the parent folder's path_locator value and construct a new path_locator value that is a child value of the parent folder's path_locator. Listing 11-20 shows how this can be done.

    -- Find the path_locator value for the parent directory
    DECLARE @parent_path_locator HIERARCHYID
    SELECT @parent_path_locator = path_locator
    FROM Documents
    WHERE name = 'Order Files'
      AND path_locator.GetLevel() = 1
      AND is_directory = 1

    -- Create a new path_locator value that places the new file
    -- as a child of the parent directory
    DECLARE @path_locator VARCHAR(675)
    SET @path_locator = @parent_path_locator.ToString() +
        CONVERT(VARCHAR(20), CONVERT(BIGINT, SUBSTRING(CONVERT(BINARY(16), NEWID()), 1, 6))) + '.' +
        CONVERT(VARCHAR(20), CONVERT(BIGINT, SUBSTRING(CONVERT(BINARY(16), NEWID()), 7, 6))) + '.' +
        CONVERT(VARCHAR(20), CONVERT(BIGINT, SUBSTRING(CONVERT(BINARY(16), NEWID()), 13, 4))) + '/'

    -- Insert a new record, specifying the new path_locator
    INSERT INTO Documents (name, file_stream, path_locator)
    VALUES ('OrderFile.txt', 0x, @path_locator)

Listing 11-20: Create a zero-length file in a subfolder using T-SQL.

We are creating the HIERARCHYID value by calling the NEWID() function three times and, for each call:

• Firstly, extracting a specific number of bytes from it (6 the first time, 6 the second time and 4 the third time).
• Secondly, converting the binary value to a 64-bit integer.
• Thirdly, converting the integer to a variable-length string of maximum 20 characters (though the true maximum length is 15 characters the first two times and 10 characters the third time).

Each of these values is then concatenated and separated with a period. At the end, a forward slash is added. This is the same way in which the SQL Server default constraint creates the path_locator value. Unfortunately, the NEWID() function cannot be used in a user-defined function. Otherwise, the smart developer would undoubtedly create a reusable function to create a path_locator value for a new child given the parent's path_locator.

Also, Microsoft uses the NEWID() function three times in the constraint, because it's not possible to declare a variable there to hold the value of a single call. If you plan to create a stored procedure, you might just make one call to NEWID() and store the value in a temporary variable. Listing 11-21 shows what such a stored procedure could look like.

    CREATE PROCEDURE dbo.CreateFileTableFile
        @name NVARCHAR(255),
        @parent_name NVARCHAR(255),
        @parent_level tinyint
    AS
    BEGIN
        SET NOCOUNT ON;
        DECLARE @parent_path_locator HIERARCHYID;
        DECLARE @temp_id BINARY(16);

        -- Find the path_locator of the parent of the new file
        SELECT @parent_path_locator = path_locator
        FROM Documents
        WHERE name = @parent_name
          AND path_locator.GetLevel() = @parent_level
          AND is_directory = 1;

        -- If the parent's path_locator was found
        IF (@parent_path_locator IS NOT NULL)
        BEGIN
            SET @temp_id = CONVERT(BINARY(16), NEWID());

            -- Create a new path_locator value that places the new file
            -- as a child of the parent directory
            DECLARE @path_locator VARCHAR(675)
            SET @path_locator = @parent_path_locator.ToString() +
                CONVERT(VARCHAR(20), CONVERT(BIGINT, SUBSTRING(@temp_id, 1, 6))) + '.' +
                CONVERT(VARCHAR(20), CONVERT(BIGINT, SUBSTRING(@temp_id, 7, 6))) + '.' +
                CONVERT(VARCHAR(20), CONVERT(BIGINT, SUBSTRING(@temp_id, 13, 4))) + '/';

            -- Insert a new record, specifying the new path_locator
            INSERT INTO Documents (name, file_stream, path_locator)
            VALUES (@name, 0x, @path_locator);
        END
        ELSE
        BEGIN
            -- Raise error because the specified parent folder does not exist
            RAISERROR ('The parent name does not exist in the FileTable at the specified level.', 16, 1);
        END
    END

Listing 11-21: A stored procedure to insert a new row in a FileTable.

It is possible to create simpler values for the path_locator field, such as just appending 1/ to the parent_path_locator value, but then your values are deviating from the default structure. Also, you have to provide some mechanism to ensure uniqueness of the path_locator, which will likely involve a call to NEWID() anyway.

In the same fashion, we can now create a second-level subfolder using Listing 11-22. The only difference from Listing 11-20 is found in the INSERT statement, where the file_stream column has been replaced by the is_directory column.

    -- Find the path_locator value for the parent directory
    DECLARE @parent_path_locator HIERARCHYID
    SELECT @parent_path_locator = path_locator
    FROM Documents
    WHERE name = 'Order Files'
      AND path_locator.GetLevel() = 1
      AND is_directory = 1

    -- Create a new path_locator value that places the new folder
    -- as a child of the parent directory
    DECLARE @path_locator VARCHAR(675)
    SET @path_locator = @parent_path_locator.ToString() +
        CONVERT(VARCHAR(20), CONVERT(BIGINT, SUBSTRING(CONVERT(BINARY(16), NEWID()), 1, 6))) + '.' +
        CONVERT(VARCHAR(20), CONVERT(BIGINT, SUBSTRING(CONVERT(BINARY(16), NEWID()), 7, 6))) + '.' +
        CONVERT(VARCHAR(20), CONVERT(BIGINT, SUBSTRING(CONVERT(BINARY(16), NEWID()), 13, 4))) + '/'

    -- Insert a new record, specifying the new path_locator
    INSERT INTO Documents (name, is_directory, path_locator)
    VALUES ('2013 Order Files', 1, @path_locator)

Listing 11-22: Create a new subfolder using T-SQL.

Some of the complexities in adding files and folders using T-SQL, specifically children of other folders, may lead many developers to choose to use the Windows APIs instead. The next section discusses using the standard .NET System.IO namespace classes to achieve the same results.

Using .NET file I/O APIs

In this section, we will use the traditional .NET file I/O APIs to access FileTable data. First, we will need to determine the path to the files and folders that are stored in the FileTable. There is a new T-SQL function that allows us to get the UNC path of a FileTable's file_stream column: GetFileNamespacePath(). This function takes two optional arguments; the first determines if a full path (which includes server name, share name and FileTable directory name) is returned or if a relative path is returned. The second argument determines the format in which the server name is returned, and is similar to the PathName() function found on a regular FILESTREAM column. (The PathName() function works on the file_stream column, but returns the path you would use to create a SqlFileStream instance in the context of a transaction.)

Listing 11-23 shows an ADO.NET code snippet to retrieve the full path of a file whose name and location in the hierarchy is known. There are other ways to initially find the path_locator of a file or folder, which can then be used to retrieve the UNC path.

    string SqlText = @"SELECT file_stream.GetFileNamespacePath(1)
                       FROM Documents
                       WHERE name = 'MyFile.txt'
                       AND path_locator.GetLevel() = 1";
    string FilePath = null;
    using (SqlConnection conn = new SqlConnection("server=.;database=NorthPole;integrated security=sspi;"))
    using (SqlCommand cmd = new SqlCommand(SqlText, conn))
    {
        conn.Open();
        FilePath = (string)cmd.ExecuteScalar();
    }

Listing 11-23: ADO.NET code snippet to retrieve full path.

You'll note that this code does not use any of the FILESTREAM-related queries we've seen in previous chapters. This is a plain T-SQL query, the only new addition being the use of the GetFileNamespacePath() function. This function is discussed in more detail later in the chapter. We can use this file path to open the file for reading and output its contents to the console (Listing 11-24).

    if (FilePath != null)
    {
        using (System.IO.StreamReader sr = System.IO.File.OpenText(FilePath))
        {
            Console.WriteLine(sr.ReadToEnd());
        }
    }

Listing 11-24: Open a file in a FileTable for read access using .NET file I/O APIs.

Listing 11-24 is not designed to demonstrate best practices in the use of the .NET file I/O APIs. Specifically, in this example, we are reading an entire file as a string without considering the file size. In a real-world scenario, we might be doing this to a multi-gigabyte file and causing significant performance issues. Using file streams would be a better practice.
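One hedged way to guard against reading an unexpectedly large file as a single string is to check its size in the query that retrieves the path. The sketch below relies on the cached_file_size column of the fixed FileTable schema; the 1 MB threshold is an arbitrary assumption for illustration, not a recommendation.

```sql
-- Return the UNC path only when the file is small enough to read in one go.
-- Assumption: 1 MB (1048576 bytes) is an illustrative threshold.
SELECT file_stream.GetFileNamespacePath(1) AS full_path,
       cached_file_size                    AS size_in_bytes
FROM   dbo.Documents
WHERE  name = 'MyFile.txt'
  AND  path_locator.GetLevel() = 1
  AND  is_directory = 0
  AND  cached_file_size <= 1048576;
```

If the query returns no row, the application can fall back to reading the file in chunks through a stream rather than calling ReadToEnd().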

You may also want to use the file I/O APIs to create new files and folders. First, you'll need to obtain the UNC path of a valid folder in the FileTable namespace. Listing 11-25 shows how to obtain the UNC path of the FileTable root folder. It uses the T-SQL FileTableRootPath() function to retrieve the UNC path. The argument to the function is the name of the FileTable. If this argument is not specified, the database's root directory is returned. This function is discussed in more detail in the section below on Managing FileTables.

    string SqlText = "SELECT FileTableRootPath('Documents')";
    string RootPath = null;
    using (SqlConnection conn = new SqlConnection("server=.;database=NorthPole;integrated security=sspi;"))
    using (SqlCommand cmd = new SqlCommand(SqlText, conn))
    {
        conn.Open();
        RootPath = (string)cmd.ExecuteScalar();
    }

Listing 11-25: Obtaining the root folder of a FileTable in .NET.

Listing 11-26 then uses this root folder value to create a folder and a file inside the new folder. In this code snippet, the System.IO namespace is not explicitly specified for the class names, so make sure you have a using statement (or Imports in Visual Basic) at the top of your code file before running this code.

    if (RootPath != null)
    {
        DirectoryInfo di = Directory.CreateDirectory(RootPath + "\\Order Files");
        using (StreamWriter sw = File.CreateText(di.FullName + "\\ReadMe.txt"))
        {
            sw.WriteLine("Hello, World!");
        }
    }

Listing 11-26: Creating a folder and file in a FileTable using .NET file I/O APIs.

As a best practice, consider writing code that is independent from the SQL Server configuration by using relative paths. If you are storing path values anywhere, store the value relative to the database-level directory. That way, a change to the configuration of SQL Server will not break the application.

You may be storing path values as part of an application's configuration data, which might allow the application to retrieve the folder where all order files are located. To avoid storing the full path:

    \\servername\MSSQLSERVER\NorthPole\Documents\Order Files

store instead:

    \Order Files

Then, when the full path is needed, call the FileTableRootPath() function and concatenate your stored value to the return value of the function.

A more comprehensive discussion of the use of .NET file I/O APIs is beyond the scope of this book. However, below are some limitations that exist in the use of the APIs.

• It's not possible to rename the database or FileTable root directories using the I/O APIs.
• It's not possible to obtain an exclusive handle on the database or FileTable root directories.

Managing FileTables

In addition to the functions and views that apply to all FILESTREAM-enabled databases and tables, there are specific functions and views added to SQL Server 2012 to support the FileTable feature.

T-SQL functions

In addition to the PathName() function, which has existed since SQL Server 2008, three new documented T-SQL functions have been added to SQL Server 2012: FileTableRootPath(), GetFileNamespacePath() and GetPathLocator().
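To make the relative-path practice from earlier in this section concrete, here is a minimal sketch that also exercises the first of these functions, FileTableRootPath(). It assumes the application stored \Order Files relative to the root of the Documents FileTable, so the FileTable name is passed to the function; the variable name is illustrative.

```sql
-- Assumption: the application stored only this relative value in its
-- configuration data, relative to the root of the Documents FileTable.
DECLARE @stored_relative_path NVARCHAR(255) = N'\Order Files';

-- Build the full UNC path at run time from the current server configuration.
SELECT FileTableRootPath(N'Documents') + @stored_relative_path AS full_path;
```

Because the server, share and directory names are resolved at run time, renaming the FILESTREAM share or the database-level directory does not invalidate the stored value.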

FileTableRootPath

FileTableRootPath() returns the UNC path of the root directory of the current database and, with the use of the optional argument, appends the root directory of the specified FileTable. On a default server instance and assuming all default share and directory names, a call to FileTableRootPath() in the NorthPole database will yield this result:

    \\SERVERNAME\MSSQLSERVER\NorthPole

A call to FileTableRootPath('Documents') will yield this result:

    \\SERVERNAME\MSSQLSERVER\NorthPole\Documents

GetFileNamespacePath

This intrinsic function returns the UNC path of the file or folder referenced by the file_stream column. This function has two optional arguments.

The first argument determines if the computer name and share name are also included. If the argument is 0 (default), only the relative path starting with the database-level directory is returned. If the argument is 1, the full UNC path is returned.

The second argument determines if the computer name is converted. The default value of 0 converts the computer name to NetBIOS format, a value of 1 does not convert the computer name, and a value of 2 returns the fully qualified domain name of the server. (This argument and its values are identical to that found in the PathName() intrinsic function.)

Listing 11-27 shows the use of the GetFileNamespacePath() function with the default argument values and with non-default argument values.

    SELECT file_stream.GetFileNamespacePath() 'Default',
           file_stream.GetFileNamespacePath(1, 2) 'Non-default'
    FROM Documents
    WHERE name = 'File.txt'
    /*
    Default             Non-default
    ------------------- --------------------------------------------------
    \Documents\File.txt \\SQL.domain.us\MS…ER\NorthPole\Documents\File.txt
    */

Listing 11-27: Using the GetFileNamespacePath() function.

GetPathLocator

This function returns the HIERARCHYID associated with the specified, existing full path. You would use this function if you have a known FileTable path and need to locate the SQL Server data row associated with it. Books Online provides a migration scenario from a file server to FileTable as an example at http://brurl.com/fs15.

In addition, several undocumented T-SQL functions have been added. These functions are used in the constraints related to FileTables, but it appears they cannot be called from regular T-SQL code.

PathName

The PathName() function has been around since the introduction of FILESTREAM in SQL Server 2008. The function works on the file_stream column of a FileTable and you would use its value if you use transactional access to the FileTable and you want to use the FILESTREAM Win32 API calls. You cannot use the return value of the PathName() function to open a FileTable file for non-transactional access.
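The difference between the two kinds of path, and the reverse lookup offered by GetPathLocator(), can be sketched as follows. This is a minimal illustration, assuming the Documents FileTable from this chapter and a ReadMe.txt file in its root folder; the column aliases and variable name are illustrative.

```sql
-- Contrast the two kinds of path for the same files.
SELECT name,
       file_stream.PathName()              AS transactional_path,     -- for SqlFileStream / Win32 FILESTREAM access
       file_stream.GetFileNamespacePath(1) AS non_transactional_path  -- for ordinary file I/O
FROM   dbo.Documents
WHERE  is_directory = 0;

-- Going the other way: locate the row for a known UNC path.
-- Assumption: ReadMe.txt exists in the root of the Documents FileTable.
DECLARE @full_path NVARCHAR(4000) =
    FileTableRootPath(N'Documents') + N'\ReadMe.txt';

SELECT name, is_directory
FROM   dbo.Documents
WHERE  path_locator = GetPathLocator(@full_path);
```

The second query is the pattern the migration scenario relies on: given a file path held elsewhere, find the FileTable row that now represents it.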

There has been a change to the PathName() function in SQL Server 2012 to support Availability Groups, a new feature in the latest release. Any discussion of Availability Groups is beyond the scope of this book, but PathName() now has a second optional argument that determines whether the returned name should be the virtual network name (VNN) (default) or the computer name.

Catalog and Dynamic Management Views

Three new catalog management views and one new DMV have been added to SQL Server 2012. One catalog view has been updated to support FileTable.

sys.tables

The sys.tables catalog view has been updated to include the is_FileTable column. A value of 1 in the is_FileTable column indicates that the table is a FileTable.

sys.database_filestream_options

We have already used this catalog view in Listing 11-9 to check the readiness of a database for FileTable creation. The view includes four columns.

• database_id – The ID of the database within the SQL Server instance. This is useful for filtering by a particular database. You can retrieve the ID of a database using the DB_ID() function.
• directory_name – The database-level directory name for all FileTables within the database. If this value is NULL, a database-level directory has not been set and the database cannot contain FileTables.

• non_transacted_access – A TINYINT representing the level of non-transactional access for the database. The values can be:
    • 0: Non-transactional access is disabled
    • 1: Non-transactional access is only enabled for read-only access
    • 2: Non-transactional access is fully enabled
    • 5: Non-transactional access is in transition to read-only access
    • 6: Non-transactional access is in transition to disabled.
• non_transacted_access_desc – The description of the value in the non_transacted_access column. The possible values and their numeric equivalents are:
    • NONE (0)
    • READ_ONLY (1)
    • FULL (2)
    • IN_TRANSITION_TO_READ_ONLY (5)
    • IN_TRANSITION_TO_OFF (6)

sys.filetable_system_defined_objects

This catalog view returns a list of all the system-defined objects that have been created when the FileTables in the current database were created. This includes the FileTable's primary key, foreign key, unique and check constraints, and defaults. A FileTable has 20 such system-defined objects.

The view contains two columns: object_id and parent_object_id. The object_id column contains the object ID of the system-defined object, and the parent_object_id column contains the object ID of the FileTable. Listing 11-28 shows how the object name of each object can be retrieved using the OBJECT_NAME() function. On your system, you should recognize the names of the objects, although the IDs and numeric values at the end of the names will be different.

    select [object_id],
           OBJECT_NAME([object_id]) 'name',
           parent_object_id,
           OBJECT_NAME(parent_object_id) 'parent name'
    from sys.filetable_system_defined_objects
    /*
    object_id name                           parent_object_id parent_name
    --------- ------------------------------ ---------------- -----------
    277576027 PK__Document__5A5B77D57DF9CB0A 261575970        Documents
    …
    581577110 DF__Documents__is_te__22AA2996 261575970        Documents
    */

Listing 11-28: Retrieving the names of the FileTable system-defined objects.

Note that the objects included in this catalog view cannot be deleted. They are an integral part of the FileTable and will be deleted by SQL Server only if the FileTable itself is deleted. The constraints can be disabled by disabling the FileTable namespace (see FileTable namespace, above).

sys.filetables

This view lists information about the FileTables in the current database. The columns in the sys.filetables view are:

• object_id – The object ID of the FileTable.
• is_enabled – Returns 0 if the FileTable's namespace is not enabled and 1 if the namespace is enabled.
• directory_name – The name of the FileTable root directory.

• filename_collation_id – The ID of the SQL Server collation used to store the file or folder name.
• filename_collation_name – The name of the SQL Server collation used to store the file or folder name.

sys.dm_filestream_non_transacted_handles

This DMV returns a list of all the non-transactional file and folder handles that are currently open. On an active server, the results of querying this view will change constantly.

There are 32 columns in this view, and their description can be found in Books Online at http://brurl.com/fs16. Here, we only discuss those that are most useful in resolving issues.

• database_id – The ID of the database where the item with the open handle is located.
• object_id – The ID of the FileTable where the item with the open handle is located.
• handle_id – A unique number that can be used with the sp_kill_filestream_non_transacted_handles stored procedure to kill the handle.
• opened_file_name – The relative path that was originally requested to be opened.
• open_time – The UTC time the handle was opened.
• login_name – The user name of the user who opened the handle.

Stored procedures

One new stored procedure has also been added: sp_kill_filestream_non_transacted_handles. This stored procedure can be used to kill handles to files that have been opened using non-transactional access. The stored procedure takes two optional arguments, table_name and handle_id. If these arguments are not provided, or NULL is passed, all non-transactional file handles are closed. If only table_name is provided, all non-transactional file handles for that table are closed. If table_name is provided, then a handle_id can also be specified to close a specific file. The handle IDs of open files can be obtained from the dynamic management view sys.dm_filestream_non_transacted_handles, discussed previously.

Advanced FileTable Uses

In this section, we will demonstrate the use of FileTable with two other SQL Server features that are particularly well suited to be used in conjunction with it: full-text search and semantic search. Full-text search indexes the contents of columns of supported types (including VARBINARY(MAX) and FILESTREAM) and allows for fast searching of those contents. It is somewhat like the Windows Search feature. Semantic search is new in SQL Server 2012 and builds upon the full-text indexing feature.

Full-text search

FileTable is ideally suited as a document storage repository, so adding a full-text index to the table may help users locate documents. The schema of FileTables meets the requirements for full-text indexing.

• The file contents are in a VARBINARY(MAX) FILESTREAM column.
• The file extension is stored in a computed column, file_type. This column is used to map to the correct IFilter.
• There is a unique index on the FileTable that meets the requirements for full-text indexing: the primary key of the FileTable.
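Because the primary key of a FileTable is system-defined, its name carries a generated suffix that differs from one database to the next. A minimal sketch for looking up that name before creating a full-text index, assuming the FileTable is dbo.Documents:

```sql
-- Find the name of the primary key index on the Documents FileTable.
-- This is the index a full-text index would name in its KEY INDEX clause.
SELECT i.name AS key_index_name
FROM   sys.indexes AS i
WHERE  i.object_id = OBJECT_ID(N'dbo.Documents')
  AND  i.is_primary_key = 1;
```

The value returned on your system will differ from any name printed in this chapter, so substitute it wherever a key index name is required.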

Listing 11-29 shows how to create a full-text catalog in the database and a full-text index on the name and file_stream columns of the Documents FileTable.

    -- Create a full-text catalog
    CREATE FULLTEXT CATALOG NorthPole_Catalog AS DEFAULT

    -- Create a full-text index on the FileTable
    CREATE FULLTEXT INDEX ON Documents
    (
        name LANGUAGE 1033,
        file_stream TYPE COLUMN file_type LANGUAGE 1033
    )
    KEY INDEX PK__Document__5A5B77D57DF9CB0A
    ON NorthPole_Catalog

Listing 11-29: Create a full-text catalog and full-text index.

Now we can query the full-text catalog for a list of files that contain the text test in the file_stream field (Listing 11-30). The FREETEXT predicate is used to specify which column we are querying and the value we are looking for.

    SELECT file_stream.GetFileNamespacePath(1)
    FROM dbo.Documents
    WHERE FREETEXT(file_stream, 'test')

Listing 11-30: Query a full-text catalog.

Just like with any full-text index, each indexed column has a specific language code associated with it (the sys.fulltext_languages catalog view returns a list of supported languages and their codes). If your system has documents stored in different languages and you want those to be indexed correctly, you will need to create a separate FileTable for each language.

A more detailed discussion of full-text search, including the predicates that can be used and tips on maintaining the index and configuring IFilters for different file types, is beyond the scope of this book.
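As a small aid to the language discussion above, the following sketch lists the language codes available to full-text indexing and then shows which code each indexed column of the FileTable uses. It assumes the FileTable is dbo.Documents and that it already has a full-text index.

```sql
-- List the languages available to full-text indexing and their LCIDs.
SELECT lcid, name
FROM   sys.fulltext_languages
ORDER  BY name;

-- Show which language each full-text indexed column of Documents uses.
SELECT COL_NAME(ic.object_id, ic.column_id) AS column_name,
       ic.language_id
FROM   sys.fulltext_index_columns AS ic
WHERE  ic.object_id = OBJECT_ID(N'dbo.Documents');
```

A language_id of 1033 corresponds to the LANGUAGE 1033 (English) clause used when the index was created.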

Semantic search

Semantic search is a new feature in SQL Server 2012 and allows for writing more advanced queries that can include the relationship between words in the FileTable contents. Scenarios that are enabled by this feature include the automatic extraction of key words in a document (to populate a tag cloud, for example) or the creation of a document similarity index to compare documents with each other.

An exhaustive discussion of the semantic search feature and its linguistic underpinnings is not appropriate for this book. However, we will examine how to:

• enable semantic search on a FileTable which already has full-text indexing enabled
• write a basic query which identifies keywords in a specified document.

Prepare the server for semantic search

Books Online has a complete discussion of the prerequisites for semantic search (http://brurl.com/fs17). During installation, the Full-Text and Semantic Extractions for Search instance feature must be selected.

After installation, a semantic language statistics database must be installed to the system and attached to the SQL Server instance. This database is not installed with SQL Server 2012. Instead, you must execute a separate Windows Installer package to install the database. Afterwards, you must attach the database to your SQL Server instance.

Finally, the semantic language statistics database must be registered by calling the sp_fulltext_semantic_register_language_statistics_db stored procedure, specifying the semantic language statistics database.
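The attach and register steps can be sketched as follows. The folder path and the database name semanticsdb are assumptions for illustration; use the location where the Windows Installer package actually placed the files on your system.

```sql
-- Assumption: the installer extracted the database files to this folder.
CREATE DATABASE semanticsdb
ON (FILENAME = N'C:\Microsoft Semantic Language Database\semanticsdb.mdf'),
   (FILENAME = N'C:\Microsoft Semantic Language Database\semanticsdb_log.ldf')
FOR ATTACH;
GO

-- Register the attached database with the SQL Server instance.
EXEC sp_fulltext_semantic_register_language_statistics_db @dbname = N'semanticsdb';
GO

-- Verify that a semantic language statistics database is registered.
SELECT * FROM sys.fulltext_semantic_language_statistics_database;
```

If the final query returns a row, the instance has a registered semantic language statistics database and semantic indexing can be enabled on a full-text index.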

Language support

Language support for semantic search is more limited than for full-text indexing. By querying the sys.fulltext_semantic_languages catalog view, a list of supported languages can be obtained.

Add a semantic search index

To add semantic search to the full-text index created in Listing 11-29, refer to the code in Listing 11-31.

    ALTER FULLTEXT INDEX ON Documents
    ALTER COLUMN file_stream
    ADD Statistical_Semantics

Listing 11-31: Create a semantic search index on an existing full-text index.

Execute a semantic query

By way of example, we will execute a semantic query on the Documents FileTable to identify keywords in a file called Resume.docx in the root folder of the FileTable (Listing 11-32).

    -- Find the key value of the document to use
    DECLARE @path_locator HIERARCHYID
    SELECT @path_locator = path_locator
    FROM Documents
    WHERE name = 'Resume.docx'
      AND path_locator.GetLevel() = 1

    -- If the document exists
    IF (@path_locator IS NOT NULL)
        -- Select the top 10 keywords from the document
        SELECT TOP(10) KeyPhraseTable.keyphrase
        FROM SEMANTICKEYPHRASETABLE
        (
            Documents,     -- the table name containing the document
            file_stream,   -- the column name of the document
            @path_locator  -- the ID associated with the document
        ) AS KeyPhraseTable
        ORDER BY KeyPhraseTable.score DESC

Listing 11-32: Writing a semantic query to determine keywords in a document.

You will note that this code was written in multiple statements. This was done primarily for clarity, although the IF clause can be used to output a message in case the document does not exist in the FileTable. Without the IF clause, the semantic search statement would not return any results and would not provide a warning that the path_locator value was not found.

Practical semantic analysis

You may want to run your own résumé through this type of semantic analysis and see which keywords are identified. You may be surprised to learn that what your résumé is seen as focusing on is something entirely different from what you intended. You can safely assume that employers and employment agencies employ similar linguistic analysis to rank your résumé against others, or to compare your résumé to job descriptions.
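Returning to the remark about the IF clause in Listing 11-32, here is a minimal sketch of how a message could be produced when the document is missing. The wording of the message is illustrative, and the score column is added to the output only to make the ranking visible.

```sql
-- Find the key value of the document to use
DECLARE @path_locator HIERARCHYID;

SELECT @path_locator = path_locator
FROM   dbo.Documents
WHERE  name = 'Resume.docx'
  AND  path_locator.GetLevel() = 1;

IF (@path_locator IS NOT NULL)
    -- Select the top 10 key phrases from the document, with their scores
    SELECT TOP (10) KeyPhraseTable.keyphrase, KeyPhraseTable.score
    FROM   SEMANTICKEYPHRASETABLE(Documents, file_stream, @path_locator) AS KeyPhraseTable
    ORDER  BY KeyPhraseTable.score DESC;
ELSE
    -- Tell the caller why no key phrases were returned
    PRINT 'Resume.docx was not found in the root folder of the Documents FileTable.';
```

In application code, a RAISERROR call in the ELSE branch may be preferable to PRINT, since an error is easier for a client to detect than an informational message.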

Investigating FileTable

While end-users will view the files in the FileTables like they would view any file share, as a developer or database administrator it is still important to understand that the file contents are actually stored in the FILESTREAM data container associated with the file_stream column of the FileTable.

The same techniques used in Chapter 9 to investigate the location of the physical files on the SQL Server host can be used to locate the FileTable data. Note that the folder hierarchy in a FileTable is purely logical: the FILESTREAM data container is a flat structure that contains all file data, regardless of the folder they are located in according to the hierarchy structure of the FileTable.

The file names found in the FILESTREAM data container are LSNs, and not the file names found in the name column of the FileTable. Again, this is consistent with how FILESTREAM is implemented and this has not changed in SQL Server 2012.

The rules for updates to FILESTREAM data also apply. Most importantly, any modification to a FileTable file results in the creation of a new file in the FILESTREAM data container and the old file being marked as garbage by adding it to the FILESTREAM tombstone table.

Summary

In this chapter, we examined the newly introduced FileTable feature. Available in SQL Server 2012, FileTable expands the FILESTREAM concept, primarily by allowing non-transactional access to the stored BLOBs and by providing a hierarchical storage model, similar to folders on an NTFS volume.

To adequately understand FileTable, we first introduced several new concepts, including the fixed schema, non-transactional access, FileTable namespace and FileTable security. Next, we configured the server for FileTable and created a database with FileTable support.

We created a FileTable using T-SQL and examined the support for FileTables in SSMS. We used Windows Explorer to access the FileTable contents and create new files and folders. In the Programming FileTable section, we learned how to add data to a FileTable using T-SQL and the .NET file I/O APIs. We concluded that using the .NET file I/O APIs is preferable.

Then we looked at new SQL Server objects to support FileTable, including new functions, views and stored procedures. These objects help us to manage the FileTables in our environment by providing insight into the structure and contents of the FileTable data.

Finally, we examined two advanced features that are supported for FileTable data: full-text search and semantic search. The schema of FileTables lends itself well to being used with both full-text and semantic search.

Chapter 12: Planning, Configuration and Best Practices

Correct planning and configuration are vital ingredients in any successful technology implementation, and the FILESTREAM feature is no exception to this rule. To make sure the FILESTREAM feature is tuned for optimum performance and manageability, there are a number of factors to consider. The most important practices are included in this chapter. Readers may also wish to further refer to the seminal white paper by Paul Randal: FILESTREAM Storage in SQL Server 2008 (http://brurl.com/fs7). Even though the paper was written to target SQL Server 2008 and written around the original release, the information it presents is still valid and highly useful.

Planning

FILESTREAM is a great feature if implemented where it is really needed, but it should not be an automatic choice for every BLOB storage requirement. It is very important to understand this at the planning stage of your project. Keep in mind that there are cases where storing BLOB values in a VARBINARY(MAX) column will perform better than storing them in a FILESTREAM column, so we need to understand where FILESTREAM is ideal and where it is not.

As with many technology questions, the answer is "it depends." A Microsoft Research study (conducted before the introduction of the FILESTREAM feature) investigated the impact of storing BLOB values in the database versus the file system, and found the following:

• BLOB values over 1 MB in size provide best performance when stored in the file system (i.e. would benefit from storage in a FILESTREAM column).

• BLOB values smaller than 256 KB in size provide best performance when stored within the database (i.e. would benefit from storage in a VARBINARY(MAX) column).
• Performance for BLOB values between 256 KB and 1 MB depends largely on the way the data is used.

Microsoft research

You can download the research paper To BLOB or Not To BLOB: Large Object Storage in a Database or a Filesystem: http://brurl.com/fs2.

This suggests that FILESTREAM is ideal only if the average size of BLOB values is over 1 MB. If the data is smaller, storing it in a VARBINARY(MAX) column will be more efficient.

The usage pattern and business requirements also play a crucial role in deciding whether to choose FILESTREAM storage. For example, if the BLOB data is modified frequently (partial updates) a VARBINARY(MAX) column might give better performance than a FILESTREAM column.

If the answer to the question whether or not to go with FILESTREAM is "Yes," then you need to work on the configuration to ensure that the storage environment is optimized for the best FILESTREAM performance.
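When existing BLOB data is available, the size thresholds discussed above can be checked against it before deciding. The following is a minimal sketch; dbo.CandidateDocuments and its DocumentData column are hypothetical names standing in for an existing table with a VARBINARY(MAX) column.

```sql
-- Assumption: dbo.CandidateDocuments and DocumentData are hypothetical
-- names for an existing table and its VARBINARY(MAX) column.
SELECT COUNT(*)                                             AS total_blobs,
       AVG(CAST(DATALENGTH(DocumentData) AS BIGINT)) / 1024 AS avg_size_kb,
       SUM(CASE WHEN DATALENGTH(DocumentData) < 256 * 1024
                THEN 1 ELSE 0 END)                          AS under_256_kb,
       SUM(CASE WHEN DATALENGTH(DocumentData) BETWEEN 256 * 1024 AND 1024 * 1024
                THEN 1 ELSE 0 END)                          AS between_256_kb_and_1_mb,
       SUM(CASE WHEN DATALENGTH(DocumentData) > 1024 * 1024
                THEN 1 ELSE 0 END)                          AS over_1_mb
FROM   dbo.CandidateDocuments
WHERE  DocumentData IS NOT NULL;
```

The query scans every row of the table, so on a large table it is best run outside peak hours. The resulting distribution is only one input to the decision; the usage pattern still matters, as noted above.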

Optimizing your Storage Configuration for FILESTREAM

Earlier chapters showed how to easily configure a SQL Server instance for experimenting with FILESTREAM storage. When it comes to setting up a production server for FILESTREAM storage, a number of additional configuration steps are recommended to make sure that the server is tuned for optimum FILESTREAM performance.

Keep FILESTREAM data containers on a separate disk volume

FILESTREAM operations can be very I/O intensive, so it is advisable to keep the FILESTREAM data container isolated from the data and log files of your database. If you have several FILESTREAM-enabled databases on the server, it is advisable to put each FILESTREAM container on a separate disk volume.

If you expect to store a huge number of data files in your FILESTREAM data container, you can partition the FILESTREAM table and split the FILESTREAM data across multiple disk volumes. We saw how to partition a FILESTREAM table in Chapter 2.

Disabling short file names

As we discovered in Chapter 2, for every FILESTREAM cell that is not NULL, SQL Server creates a disk file in the columns folder of the data container. If the table is large, it is quite possible that the folders associated with the FILESTREAM columns will contain a large number of disk files.

By default, NTFS creates a secondary file name for every file that you create in the NTFS volume. These secondary file names are called short file names, and they follow the legacy 8.3 naming convention. Creation of short file names in folders that have a large number of files is very resource intensive, because the system needs to ensure uniqueness. It does this by comparing the newly created name with the short file names of all the other files present in the folder.

Short file name generation is turned on by default. However, short file names are needed only if you expect 16-bit applications to read your files, which is highly unlikely on today's production servers. In addition, the data you store in the FILESTREAM data container should not be directly accessed by any application other than SQL Server. This means that you can safely turn off short file name creation on disk volumes where you store the FILESTREAM data.

You can use the command line tool fsutil.exe to disable short file names on your system. To check whether short file name creation is already disabled, query the current setting by running the command shown in Listing 12-1 in the command window.

fsutil.exe behavior query disable8dot3

Listing 12-1: Querying the short file name setting.

This will return the current value configured for this setting, as shown in Figure 12-1.

Figure 12-1: The current setting for short file names.

If short file names are not disabled, you can disable them by running fsutil.exe with the set parameter (Listing 12-2).

fsutil.exe behavior set disable8dot3 1

Listing 12-2: Changing the short file names setting.

You must reboot your computer for the changes to take effect.

While doing the research for this chapter, I came across a number of articles which mentioned that disabling short file names does not really provide any performance benefits. To verify this, I ran some tests which created a large number of files in a folder, and compared the performance differences.

The test script created 100,000 files in 20 laps, each lap creating 5,000 text files in a folder. At first, the test script was executed with short file names enabled. The script was run three times, and the average time it took to complete each lap was recorded. Then the test was run another three times with short file name generation disabled. (A server reboot was required for the change to take effect.) My testing, summarized in Table 12-1, showed a considerable performance improvement after disabling 8.3 file names on the system.

       Duration (in seconds)
       with short file names:
Lap    Enabled    Disabled    File names
1      4          1           File_1_1.txt to File_1_5000.txt
2      10         3           File_2_1.txt to File_2_5000.txt
3      9          4           File_3_1.txt to File_3_5000.txt
4      10         2           File_4_1.txt to File_4_5000.txt
5      11         4           File_5_1.txt to File_5_5000.txt
6      15         3           File_6_1.txt to File_6_5000.txt
7      21         4           File_7_1.txt to File_7_5000.txt
8      51         1           File_8_1.txt to File_8_5000.txt
9      106        3           File_9_1.txt to File_9_5000.txt
10     3          4           File_10_1.txt to File_10_5000.txt
11     4          2           File_11_1.txt to File_11_5000.txt
12     4          2           File_12_1.txt to File_12_5000.txt
13     5          1           File_13_1.txt to File_13_5000.txt
14     7          2           File_14_1.txt to File_14_5000.txt
15     7          2           File_15_1.txt to File_15_5000.txt
16     10         2           File_16_1.txt to File_16_5000.txt
17     10         1           File_17_1.txt to File_17_5000.txt
18     13         2           File_18_1.txt to File_18_5000.txt
19     14         2           File_19_1.txt to File_19_5000.txt
20     3          1           File_20_1.txt to File_20_5000.txt

Table 12-1: Performance test results for short file names.
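The original test script is not reproduced in the text. A comparable measurement could be sketched as follows, in Python and under stated assumptions: the function name time_file_creation is hypothetical, the file content is an arbitrary placeholder, and only the lap structure and the File_<lap>_<n>.txt naming pattern are taken from the description above.

```python
import os
import tempfile
import time


def time_file_creation(folder, laps=20, files_per_lap=5000):
    """Create File_<lap>_<n>.txt files in folder and time each lap.

    Returns a list with the elapsed seconds for every lap, in lap order.
    """
    durations = []
    for lap in range(1, laps + 1):
        start = time.perf_counter()
        for n in range(1, files_per_lap + 1):
            path = os.path.join(folder, f"File_{lap}_{n}.txt")
            with open(path, "w") as handle:
                handle.write("test")  # placeholder content, an assumption
        durations.append(time.perf_counter() - start)
    return durations
```

To compare the two settings, you would run time_file_creation against an empty folder on the target NTFS volume, change the disable8dot3 setting, reboot, and run it again against another empty folder. A temporary folder created with tempfile.TemporaryDirectory() is convenient for a quick trial run with small lap sizes.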

Initially, I was surprised that, when the test was run with short file names enabled, the duration went up with each lap (up to Lap 9) and then fell from 106 seconds to 3 seconds at Lap 10. The same happened after Lap 19. Then I realized that the delay has to do with the amount of data NTFS has to look at before creating a new file. After Lap 9, the file name pattern changed, and the same happened after Lap 19. This indicates that not only the number of files in the folder, but also the way files are named, affects performance when short file names are enabled.

SQL Server names FILESTREAM data files based on the LSN that created the FILESTREAM data file. Based on the workload and usage pattern on your server, the file names on disk may or may not follow a specific pattern. The system overhead to create short file names depends on the number of files with similar names in the same folder. If the file names created are similar, you might notice a performance difference after disabling short file names. It is important to note that the impact of short file names comes into the picture only when new files are created. If your application does not create new files, the impact of the change may not be noticeable.

Compressing FILESTREAM data

When data is compressed, the I/O pressure on the system is reduced because less data is read from, and written to, the disk. On the other hand, CPU usage increases because the storage engine has to compress the data before writing to the disk, and decompress it on reading. Depending upon your workload, CPU, and I/O subsystem capabilities, compression may or may not help. In most cases, I have seen that compression helps if the data is compressible.

The SQL Server row- and page-level compression feature does not compress FILESTREAM data files. However, SQL Server allows FILESTREAM data files to be stored on compressed disk volumes. This will reduce the storage space needed for the data. The decision to go with compressed FILESTREAM storage should be taken only after clearly analyzing the workload, system configuration, the type of BLOB values to be processed, and so on. Some file types use a compressed format, so trying to compress them again will add overhead without benefits.

Configuring NTFS cluster size

It is recommended that you format your NTFS volume with a larger-than-default cluster size (allocation unit). The default cluster size in Windows 2000, Windows Server 2003, and Windows Server 2008 is 4 KB. The recommended cluster size for optimum FILESTREAM performance is 64 KB, which is the maximum cluster size. If you intend to use NTFS file compression, you must keep the cluster size at 4 KB; NTFS file compression is not possible on volumes formatted with a cluster size greater than 4 KB. You can check the current cluster size of your disk by running chkdsk.exe without any parameters at the command prompt (Figure 12-2).

Figure 12-2: Running chkdsk to see the current cluster size.

It is important to consider the cluster size of your volume during the planning phase of the FILESTREAM implementation because, unless trusted third-party tools are used, it is not possible to change the cluster size without formatting the volume (and thus erasing all data).

Disabling the indexing service

Indexing usually slows down common file operations such as opening and closing. Because FILESTREAM data files should be accessed only through SQL Server, it is recommended that you disable indexing on the NTFS volumes where FILESTREAM data is stored.

Disabling indexing on an NTFS volume is quite easy. You can do this by right-clicking on the NTFS volume and selecting Properties. In the General tab, you can see an option to enable or disable the indexing service on the selected disk volume (Figure 12-3).

Figure 12-3: Disabling indexing on an NTFS volume.

Configuring antivirus

"Should I configure my antivirus software to scan the FILESTREAM data?" is a very interesting question. The answer depends upon various factors specific to the environment, such as the source of data, or the way FILESTREAM files are processed. If you are storing data that originated from untrusted sources (for example, end-user file uploads, perhaps even by users outside your own organization), you may want to enable virus scans. However, it is important to recognize that virus scanning software may not be 100% effective because it will not be able to determine the file type of the stored data if the scanner uses the file extension for that purpose.

If you decide to configure your antivirus software to scan the FILESTREAM data container, make sure you do not configure it to delete the infected files. If the antivirus software deletes an infected file from the FILESTREAM data container, you will end up with a corrupted database and it will be hard for you to figure out what went wrong and subsequently recover the database.

It is therefore recommended that you configure your antivirus software to quarantine the infected files rather than deleting them. If a FILESTREAM data file is infected and the antivirus software moves it to a quarantine folder, you can run DBCC CHECKDB to identify the missing files in the FILESTREAM data container and match them with the files in the quarantine folder. Chapter 7 included the options for recovering from missing FILESTREAM data files.

You may also consider an alternative approach, in which the virus scanning is done at the time of saving the FILESTREAM data. Many antivirus software vendors provide SDKs, which expose API functions that you can use to scan a memory stream. Such an API function can be called from your .NET or Win32 code just before writing the FILESTREAM data. This approach is most effective if your application writes data exclusively using the streaming I/O APIs and not using T-SQL statements.
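The scan-before-write approach can be pictured as a simple gate in the application code. The sketch below is in Python and is illustrative only: scan stands in for whatever in-memory scanning call a vendor SDK provides, write stands in for the code that writes through the streaming API, and both, along with the names write_if_clean and InfectedContentError, are hypothetical.

```python
class InfectedContentError(Exception):
    """Raised when the scanner flags the content, so nothing is written."""


def write_if_clean(data, scan, write):
    """Scan the in-memory bytes first and write them only if they are clean.

    scan  -- hypothetical callable: takes bytes, returns True when clean
    write -- hypothetical callable: takes bytes, performs the actual write
    """
    if not scan(data):
        raise InfectedContentError("content rejected by virus scan; nothing written")
    write(data)
```

Because the check runs before any data reaches the FILESTREAM data container, a rejected upload never produces a file that the antivirus software might later quarantine or delete.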

Disabling the Last Access Time attribute

Each file in an NTFS volume has an attribute named Last Access Time. By default, NTFS tries to keep this attribute accurate by updating it every time the file is accessed. It means that even a read operation will result in a disk write operation (to update the Last Access Time attribute). Disabling this attribute on the FILESTREAM volume will give you considerable performance benefits if you deal with a large number of files that are frequently read. When the Last Access Time update is disabled, this attribute will show the same value as the File Creation Time.

You can disable Last Access Time by running fsutil.exe on computers running Windows Server 2003 and later. You might want to first check the current setting of this configuration value. You can do it by running fsutil.exe in query mode (Listing 12-3).

fsutil.exe behavior query disablelastaccess

Listing 12-3: Querying the Last Access Time setting.

If Last Access Time is not disabled, you will see a message that reads disablelastaccess is not currently set. If the system is configured to disable the update of Last Access Time, your query will return disablelastaccess = 1.

If you see that your server is currently configured to update Last Access Time, you can disable it by running fsutil.exe with the set parameter (Listing 12-4).

fsutil.exe behavior set disablelastaccess 1

Listing 12-4: Disabling Last Access Time update.

Configuring the disks with the correct RAID level

Configuring the correct RAID level is a complex subject, and a detailed discussion is beyond the scope of this book. Depending upon the environment, type of applications, and policies, the DBA team and storage administrators need to work together to identify the correct RAID level for the given environment and the performance objectives. Below, however, are some general guidelines.

1. RAID 5 + striping (also referred to as RAID 50) is one of the best options, and gives very good read and write performance as well as good fault tolerance. However, this is one of the most expensive options because it requires the most disks to implement.

2. RAID 0 gives very good read and write performance, is cheaper than RAID 5 + striping, but provides no fault tolerance.

3. RAID 5 gives very good fault tolerance, but read and write operations are not as fast as with the other options.

Just like the other configuration parameters, correct planning before implementing FILESTREAM is crucial for ensuring optimum performance.

More on RAID configurations
For a more detailed discussion of the different RAID configurations, see the Wikipedia articles on RAID at http://brurl.com/fs19, and http://brurl.com/fs20.

Regular disk defragmentation

One of the regular database maintenance activities that you need to plan for FILESTREAM databases is defragmentation of the FILESTREAM disk volumes. If the FILESTREAM data files are large or frequently modified, the chances of fragmentation are higher, and regular defragmentation of the disk will improve performance.

Setting up FILESTREAM for Remote Access

Remote access to FILESTREAM data is one of the most misunderstood terms associated with FILESTREAM storage. Remote access to FILESTREAM data means "accessing FILESTREAM data from a client application that resides on a different computer from where the SQL Server instance is installed."

When you enable FILESTREAM for I/O streaming access in the FILESTREAM configuration, FILESTREAM data is accessible from client applications running on the same computer; an application running on the same computer as a SQL Server 2008 instance can access the FILESTREAM data either using the Win32 API or the Managed API exposed by SQL Server.

There is a further option, to allow remote clients to have streaming access to FILESTREAM data, on the FILESTREAM configuration page of the SQL Server installer. You might have also noticed it in the FILESTREAM tab of the property sheet of the SQL Server instance opened from SQL Server Configuration Manager. In most real-world scenarios, the database and application will reside on different computers, therefore you should enable this option. Appendix A discusses fully the configuration of remote access.

Depending upon the security configuration of your network, you may or may not require additional changes to access FILESTREAM data remotely.

Access to FILESTREAM data from a remote computer uses the Server Message Block (SMB) protocol. SMB is a protocol used for sharing files and other resources such as printers between computers. SMB uses port 445 by default, and port 139 when it runs over NetBIOS. You need to make sure that the firewall is configured to allow access to these ports.

Configuring the client computer for remote access

Note that streaming access to the FILESTREAM data is available only when the client is authenticated using Windows authentication. So, when accessing FILESTREAM data remotely, it is very important to ensure that the Windows user account your application uses on the other computer has adequate permissions to log in to the SQL Server instance and access the FILESTREAM table using Windows authentication (Integrated Security).

As on the server side, to use FILESTREAM in a firewall-protected environment, the client must be able to resolve DNS names to the server that contains the FILESTREAM files.

Configuring the client application for remote access

On the application side, accessing FILESTREAM data remotely is no different from accessing it from the same computer. In Chapter 3, we saw several examples that demonstrated how to access the FILESTREAM data using the streaming APIs. You should be able to run those examples successfully against a remote SQL Server instance if FILESTREAM has been correctly configured for remote access.

MSDN documentation recommends doing fewer updates with bigger data chunks when accessing FILESTREAM data remotely. The recommended buffer size is 60 KB for optimum performance.
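The buffer-size advice amounts to moving data in 60 KB pieces rather than in many small writes. The following minimal sketch shows that loop in Python with generic binary streams; it is not the .NET or Win32 streaming API used elsewhere in this book, and the names copy_in_chunks and CHUNK_SIZE are assumptions made for illustration.

```python
import io

CHUNK_SIZE = 60 * 1024  # 60 KB, the buffer size recommended above


def copy_in_chunks(source, destination, chunk_size=CHUNK_SIZE):
    """Copy a binary stream to another in fixed-size chunks.

    Returns the total number of bytes copied.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means the source is exhausted
            break
        destination.write(chunk)
        total += len(chunk)
    return total
```

The same pattern applies when the destination is a stream opened over a FILESTREAM value: each call to write sends one large buffer across the network instead of many small ones.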

Best Practices for FILESTREAM Development

In this section, we have included all best practices for developers working with FILESTREAM. Some of these practices are found in other chapters of the text as well, but this section may provide value for quickly evaluating an application's design or for use during a code review.

Use FILESTREAM streaming APIs to read and write FILESTREAM data

As we have seen, reading and writing FILESTREAM data using T-SQL is not efficient in most cases because, when FILESTREAM data is accessed using T-SQL, streaming capability is not available. In addition, T-SQL access uses SQL Server's memory for buffering the FILESTREAM data, which can significantly add to the memory pressure on the server.

For best performance, always use the Win32 or Managed API for reading and writing FILESTREAM data.

Avoid small partial updates

FILESTREAM does not support in-place update of the BLOB data. When you modify a FILESTREAM value, SQL Server will create a new copy of the entire file with the modified content and discard the old file. A new copy of the file needs to be created even for a small change in the FILESTREAM data file.

If your application requires modification of data, you might want to review the expected volume of changes when you are deciding whether FILESTREAM storage is ideal. If frequent modifications are required, you might benefit more from keeping the data in a VARBINARY(MAX) column. To help you make this decision, do some performance tests to compare FILESTREAM and VARBINARY(MAX) storage options based on your specific workload and usage pattern.

Keep an additional column to store the type of file or extension

In many cases, a FILESTREAM column is expected to store more than one type of file. For example, a product catalog application might store multiple types of files (.jpg, .png, etc.) in the image column, and a document management application might store multiple types of documents (Word, Excel, etc.), and so on.

An additional column that specifies the file type can be beneficial. For example, in Chapter 10 we saw that full-text indexing expects an additional column in the table that stores the file extension. Based on the file type, the index component identifies the IFilter for the file and indexes the content. Even if you do not use full-text indexing, it may be helpful to keep a file type column, for example to identify the content type to the client application.

Identifying the type of file from a file header

Most file types have a header section that stores various pieces of information about the content of the file. For example, most image file types keep a file header that specifies the type of file, dimension of the image, and so on.

While it is recommended that you keep an additional column that specifies the file type, in the real world you may end up in a situation where there is no such column. For example, assume that the application stores different types of images in the FILESTREAM column of a product catalog table. In the absence of an extra column which stores the file type or extension, you might need to read the file header (first few bytes of the FILESTREAM data) to determine the type of file.

When only a small piece of data is to be read from the beginning of the file, T-SQL access is found, in most cases, to be more appropriate and efficient than Win32 or Managed API access.

The code snippet in Listing 12-5 reads the header information from the image files stored in a FILESTREAM column and identifies the file types.

SELECT
    ItemNumber,
    ItemDescription,
    CASE
        WHEN SUBSTRING(ItemImage, 1, 2) = 0xFFD8 THEN 'JPEG'
        WHEN SUBSTRING(ItemImage, 1, 2) = 0x424D THEN 'BMP'
        WHEN SUBSTRING(ItemImage, 1, 8) = 0x89504E470D0A1A0A THEN 'PNG'
        ELSE 'Other'
    END AS FileType
FROM Items
/*
ItemNumber ItemDescription FileType
---------- --------------- --------
MM2001     Dell Laptop     JPEG
MS1002     Microsoft Mouse PNG
MS1005     TVS Key Board   PNG
MS1007     XBOX 360        BMP
*/

Listing 12-5: Identifying file types from the file headers.

In the above example, the column FileType shows the type of images stored in the FILESTREAM column by reading the header information stored in the first few bytes of the BLOB value.
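The same header check can be applied on the client once the first few bytes of a value have been read. The following is a minimal Python sketch of that idea, using only the three signatures that appear in Listing 12-5; the names SIGNATURES and detect_file_type are hypothetical and the list is not a complete catalog of file signatures.

```python
# Signatures taken from Listing 12-5: (leading bytes, type label)
SIGNATURES = [
    (bytes.fromhex("89504E470D0A1A0A"), "PNG"),
    (bytes.fromhex("FFD8"), "JPEG"),
    (bytes.fromhex("424D"), "BMP"),
]


def detect_file_type(header):
    """Return a type label for the given leading bytes, or 'Other'."""
    for signature, label in SIGNATURES:
        if header.startswith(signature):
            return label
    return "Other"
```

Passing the first eight bytes of a value is enough for these three formats, since the longest signature in the list is the eight-byte PNG header.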

Add a content-type column on FILESTREAM tables

If you intend to serve the FILESTREAM data from a web application, it may be beneficial to store the appropriate MIME type on each row. When the web application serves the FILESTREAM data to the client, it can set the appropriate content type in the HTTP response header. Based on the content type, the client browser/application can open the file using the appropriate application.

In most cases, if you have a column that stores the file extension or file type, you can build the content type (such as "application/msword" for Word documents, "image/jpeg" for JPEG images, and so on). But if you deal with a large number of different file types, it may be a good idea to store the MIME type of the BLOB value stored in each row.

Add a timestamp or date column to track last modified date for cache control

If you are serving FILESTREAM data from web applications, caching is an important factor to consider. When a client browser and web server communicate over the HTTP protocol, they exchange some caching information. The client tells the server about the copy of the file locally available (local cache) and the server instructs the client to use the local copy or to download a new copy of the file.

If the web server tells the client that the file on the server is the same as the copy that the client has in the local cache, the file is not downloaded again. Instead, the client browser serves it from the local cache. This significantly enhances the performance of web applications.

Caching information is exchanged between the client browser/application and web server using the HTTP cache headers. The client browser sends the time stamp or ETag of the file it has in the local cache. After examining the cache header information of the incoming request, the web server tells the client whether the server has a newer version of the file or not. The file needs to be downloaded only if the server has a newer version or if the cache has expired.

In the case of a FILESTREAM application, the data is stored and managed within the database. The web server does not know about the modified status of the file. As a result, in most cases, the file is downloaded every time a request comes in. This adds considerable overhead to the application, the database, and the web server.

Having a time stamp or date column on the table that keeps track of modifications done on the FILESTREAM value will help you to implement cache control in your FILESTREAM web handler. When a request comes in from a client, you can examine the cache control headers to see whether the file on the server is newer than the file on the client. If the file has not been modified since the client downloaded it, your handler can send back an HTTP 304 response which tells the client to use its local copy of the file.

Use a computed column to retrieve the file size

While working with FILESTREAM databases, you might need to know the size of the BLOB value stored in the FILESTREAM column. To avoid calculating the length of each BLOB value every time you need to query, you might decide to create a column and update it at the time of writing the FILESTREAM value.

An easier option is to create a PERSISTED computed column that stores the length of the file. Listing 12-6 shows how to add a computed column that stores the size of the BLOB value stored in a FILESTREAM column.

ALTER TABLE Items
ADD FileSize AS DATALENGTH(ItemImage) PERSISTED;

Listing 12-6: Adding a computed column to store the size of the BLOB value in a FILESTREAM column.

As soon as the computed column is added, the sizes of the existing FILESTREAM values are automatically calculated and stored in the column. Whenever new FILESTREAM values are inserted or existing values are updated, the size is automatically recalculated and the computed column is updated. Listing 12-7 shows an efficient way to retrieve the file size from the computed column.

SELECT
    ItemNumber,
    ItemDescription,
    FileSize
FROM Items
/*
ItemNumber ItemDescription FileSize
---------- --------------- --------
MM2001     Dell Laptop     2290
MS1002     Microsoft Mouse 1753
MS1005     TVS Key Board   1930
MS1007     XBOX 360        28286
*/

Listing 12-7: Retrieving the file size from the computed column.

Keep a default constraint on FILESTREAM columns

We have already learnt, in Chapter 2, that a FILESTREAM table is populated using a two-step process. First, the relational data is inserted using a T-SQL query or stored procedure, then the FILESTREAM column is updated using Win32 or the Managed API.

To be able to write the FILESTREAM data using Win32 or the Managed API, the FILESTREAM cell must not be NULL. So the T-SQL query that inserts the relational data must insert a dummy value into the FILESTREAM column before the column can be updated using the FILESTREAM API. If you forget to insert this dummy value into the FILESTREAM column and try to update it using the streaming APIs, the operation will fail.

It is recommended in most cases that you add a default constraint to your FILESTREAM columns so that, when a new row is inserted, the FILESTREAM column will be populated with a non-NULL value. Listing 12-8 demonstrates how to add a default constraint to a FILESTREAM column.

ALTER TABLE Items
ADD CONSTRAINT ItemImage_Default DEFAULT 0x
FOR ItemImage

Listing 12-8: Adding a default constraint to a FILESTREAM column.

The default value for the ItemImage column is 0x, which indicates a zero-length binary value. This will effectively create a zero-length (empty) file in the FILESTREAM data container.

Summary

Correct planning and configuration are necessary to get the best performance from your FILESTREAM-enabled database. We examined a number of configuration settings that will significantly boost FILESTREAM performance.

In addition to configuring the disks with the correct RAID level, disabling short file names can help to improve performance considerably. It is recommended that you change the NTFS cluster size from 4 KB (default) to 64 KB if you do not intend to use NTFS file compression, and disable indexing and the Last Access Time attribute on the NTFS volume where you plan to store the FILESTREAM data.

If you would like to configure your antivirus software to scan the FILESTREAM data files, make sure that you configure it to only quarantine the infected files (not to delete them). If the antivirus software deletes FILESTREAM data files, the resulting database corruption may be harder to resolve.

Remote access to the FILESTREAM data uses the SMB protocol and therefore the SMB ports (445 and 139) must be enabled on the firewall.

In most cases, it will be helpful to keep some additional columns, such as the file extension, MIME content type, and file size on the FILESTREAM tables, depending upon the type and usage pattern of your application.

Appendix A: Configuring FILESTREAM on a SQL Server Instance

In this appendix, we provide detailed instructions for how to enable the FILESTREAM feature, and how to disable it if necessary.

You can enable FILESTREAM while you are installing a SQL Server 2008 instance, either using the installation wizard, or unattended. Alternatively, you can change the configuration settings on an existing SQL Server instance to enable the feature.

As described in Chapter 1, the process requires both Windows Administrator-level actions (to enable the feature) and SQL Server Administrator-level actions (to set up the FILESTREAM access level). If you enable FILESTREAM as part of the installation, the access level is automatically set up based on the settings you select during installation. However, if you enable FILESTREAM post-installation, you will have to configure it in two places: in the SQL Server Configuration Manager to enable the feature, and in the SQL Server instance properties to set the access level.

In most cases, you might decide to enable FILESTREAM as part of the installation process if you know for sure that the feature is needed. If you do not know at the time of installation whether a particular feature is needed or not, then it is a better choice to install and configure just those features that are needed, and enable other features only when required.

It is important to note that a service restart is not needed to enable the FILESTREAM feature. So there is no harm in skipping the FILESTREAM configuration step during installation and enabling it only when needed. However, fully disabling FILESTREAM will require a service restart to take effect. Therefore, it's better not to enable it during installation if you do not really need it.

In this appendix, we will examine the actions below.

1. Understanding FILESTREAM configuration and access levels.
2. Enabling and configuring FILESTREAM during installation.
3. Enabling and configuring FILESTREAM after installation.
   a. Configuring FILESTREAM at the Windows level.
   b. Configuring FILESTREAM at the SQL Server instance level:
      i. using SSMS
      ii. using T-SQL.
4. Verifying FILESTREAM configuration.
   a. Verifying Windows-level FILESTREAM configuration.
   b. Verifying SQL Server instance-level FILESTREAM configuration.
   c. Using SSMS to check that FILESTREAM is enabled.
   d. Using T-SQL to check that FILESTREAM is enabled.
5. Disabling the FILESTREAM feature.
6. Advanced installation and configuration options.
   a. Unattended SQL Server installation and FILESTREAM configuration.
   b. Enabling and configuring FILESTREAM from the command line.
   c. Configuring FILESTREAM access level from the command line.
   d. Setting up FILESTREAM on a failover cluster.

Understanding FILESTREAM Configuration and Access Levels

As FILESTREAM is a hybrid feature which deals with relational data that resides in database files, and BLOB data that resides in the file system folders, it has to be configured at the SQL Server instance level as well as the Windows operating system level. The first step is to configure it at the Windows level, which requires Windows Administrator privileges. The second is to configure it at the SQL Server instance level, which requires SQL Server sysadmin privileges.

This configuration process has been designed as a two-step process, probably because of the way roles and permissions are distributed among people who manage computers, networks, and SQL Server instances in the "real world." In most environments, Windows/Network administrators manage and administer the operating system and file system-level configurations, and database administrators manage SQL Server instance-level configurations.

If FILESTREAM is configured during the SQL Server installation, both parts of the process will be carried out during the installation process. If FILESTREAM is being configured on an existing instance, it becomes a two-step process: first enabling the feature, using the SQL Server Configuration Manager, and then setting the access level, using SSMS or T-SQL.

FILESTREAM access levels

The FILESTREAM feature can be configured to allow three different access levels.

T-SQL access

This is the lowest FILESTREAM access level. When FILESTREAM is enabled at this level, you can create FILESTREAM-enabled databases, but you can read and write FILESTREAM data only by using T-SQL.

When FILESTREAM is enabled for T-SQL access, a local or remote application that has a valid SQL Server connection open can access the FILESTREAM data just like any other relational data in the database.

Streaming access (local)

When the FILESTREAM feature is enabled for I/O streaming access, FILESTREAM data can be accessed using the Win32 APIs as well as .NET managed classes.

Note that, at this level, streaming access is available only to applications running on the same computer as the SQL Server instance. It is also important to understand that FILESTREAM streaming access is allowed only when using Windows authentication to connect to the database instance.

Streaming access (remote)

When the FILESTREAM feature is enabled for remote streaming access, FILESTREAM data can be accessed through the streaming APIs from client applications running on a different computer.
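
To make the difference between T-SQL access and streaming access more concrete, here is a minimal sketch. The table dbo.Items and its columns ItemID and ItemImage are hypothetical names used only for illustration (ItemImage is assumed to be a VARBINARY(MAX) FILESTREAM column); PathName() and GET_FILESTREAM_TRANSACTION_CONTEXT() are the built-in functions a client calls before it opens the data through the streaming APIs.

```sql
-- T-SQL access: FILESTREAM data is read like any other VARBINARY(MAX) value.
-- dbo.Items, ItemID and ItemImage are hypothetical names.
SELECT ItemID, ItemImage
FROM dbo.Items
WHERE ItemID = 1;

-- Streaming access: T-SQL only supplies the logical path and a transaction
-- context. The client application then opens the data through the Win32 or
-- .NET streaming API, which requires Windows authentication.
BEGIN TRANSACTION;

SELECT ItemImage.PathName()                 AS LogicalPath,
       GET_FILESTREAM_TRANSACTION_CONTEXT() AS TransactionContext
FROM dbo.Items
WHERE ItemID = 1;

-- ... the client application streams the data here ...

COMMIT TRANSACTION;
```

The first query works at every access level; the second pattern is only useful when one of the streaming access levels is enabled.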

FILESTREAM access level configuration

As mentioned earlier, the FILESTREAM feature has to be enabled at the operating system level as well as the SQL Server instance level. FILESTREAM configuration at the Windows level is done using SQL Server Configuration Manager. FILESTREAM configuration at the SQL Server instance level can be administered either through the SSMS GUI dialog or through T-SQL. The desired FILESTREAM access level needs to be configured in both places.

T-SQL access is the lowest FILESTREAM access level, followed by local streaming. Remote streaming is the highest FILESTREAM access level. The lower of the two configured levels is the one that applies. If the Windows Administrator enables only T-SQL access and the SQL Server administrator enables full access, only T-SQL access will be permitted. Similarly, if the Windows Administrator enables remote streaming access and the SQL Server Administrator enables T-SQL access, then only T-SQL access will be allowed.

Local versus remote access

The difference between local streaming access and remote streaming access has been a source of great confusion for many people. Questions such as "Can a user accessing my website read FILESTREAM data when remote streaming access is enabled?" have appeared several times on forums. So let us try to understand them clearly.

Let us look at an example of a network/application configuration, and see how different FILESTREAM access levels affect different use cases.

Figure A-1: Diagram demonstrating FILESTREAM local/remote configuration.

• SQL Server is installed on Server 1 and hosts a FILESTREAM-enabled database.
• Web Application 1 is installed on Server 1. The web application connects to SQL Server using SQL Server authentication.
• Web Application 2 is installed on Server 1. The web application connects to SQL Server using Windows authentication.
• Web Application 3 is configured on Server 2. The web application connects to SQL Server using Windows authentication.
• Desktop Application 1 is running on a computer at a remote location. The desktop application connects to the FILESTREAM-enabled SQL Server instance on Server 1 over the Internet using SQL Server authentication.
• Desktop Application 2 is running on a computer on the same network as the server. The desktop application connects to the FILESTREAM-enabled SQL Server instance on Server 1 over the LAN using Windows authentication.

Let's now see how each of the above applications behaves under different FILESTREAM access levels.

1. T-SQL Access is enabled
   Because only T-SQL access is enabled:
   a. Web Applications 1, 2, and 3 can access FILESTREAM data through T-SQL.
   b. Desktop Applications 1 and 2 can access FILESTREAM data through T-SQL.
   c. None of the applications can access the FILESTREAM data through streaming APIs.
2. Streaming Local Access is enabled
   a. Web Application 1 can access FILESTREAM data only through T-SQL, because streaming access is only possible when using Windows authentication.
   b. Web Application 2 can access FILESTREAM data through T-SQL and streaming APIs.
   c. Web Application 3 can access FILESTREAM data through T-SQL only, because it is not running on the same server as the SQL Server instance hosting the database.
   d. Desktop Applications 1 and 2 can access FILESTREAM data through T-SQL only.
      i. Desktop Application 1 cannot use streaming access because it uses SQL Server authentication, and because the necessary ports are blocked on the firewall.
      ii. Desktop Application 2 cannot use streaming access because it runs on a computer other than the SQL Server.
3. Streaming Remote Access is enabled
   a. Web Application 1 can access FILESTREAM data only through T-SQL, because streaming access is only possible when using Windows authentication.
   b. Web Application 2 can access FILESTREAM data through T-SQL and streaming APIs.
   c. Web Application 3 can access FILESTREAM data through T-SQL and streaming APIs.

   d. Desktop Application 1 can access FILESTREAM data through T-SQL only, because it uses SQL Server authentication and because the necessary ports are blocked on the firewall.
   e. Desktop Application 2 can access FILESTREAM data through T-SQL and streaming APIs.

Enabling and Configuring FILESTREAM During Installation

Installing and configuring FILESTREAM during SQL Server installation is the easiest installation option because the entire configuration can be completed in a single step. All the desired options can be selected during the Database Engine Configuration step during installation.

Figure A-2: Enabling FILESTREAM in the Database Engine Configuration installation step.

We'll briefly examine the different configuration options available on this page.

Enable FILESTREAM for Transact-SQL access

This is the primary level FILESTREAM configuration setting and it must be selected in order to enable the SQL Server instance to use the FILESTREAM capabilities. When this box is checked, the FILESTREAM feature will be enabled at the Windows and SQL Server instance levels. We can create, restore or attach FILESTREAM-enabled databases when the installation finishes, but we can only use T-SQL to read and write FILESTREAM data.

If the configuration is enabled, the installation process will automatically set FILESTREAM access level on the SQL Server instance to Transact-SQL Access Enabled.

Enable FILESTREAM for file I/O streaming access

When this configuration option is enabled, our SQL Server instance will allow I/O streaming access to the FILESTREAM data originating from the machine where the instance is installed. When I/O streaming is enabled, FILESTREAM data can be accessed using the Win32 API or .NET managed classes. Accessing the FILESTREAM data through the FILESTREAM API is much more efficient than T-SQL access and is always the recommended method.

When enabling FILESTREAM for I/O streaming access, we must specify a Windows share name. This Windows file share is not mapped to a directory on the system; it is simply a unique identifier that SQL Server uses to enable streaming access to the FILESTREAM data. This is internally (logically) linked to the actual file location. When we enable Win32 access to the FILESTREAM data, SQL Server installs and enables a Windows file system filter driver component. This component is responsible for doing the appropriate translation from the logical path to the file path as configured through FILESTREAM filegroup location.

If the configuration is enabled, the installation process will automatically set FILESTREAM access level on the SQL Server instance to Full access enabled.

Allow remote clients to have streaming access to FILESTREAM data

This option provides I/O streaming access to the FILESTREAM data from applications running on a different computer. Without this option, I/O streaming access to the FILESTREAM data will only be possible from an application that runs on the same computer as the SQL Server instance.

If the configuration is enabled, the installation process will automatically set FILESTREAM access level on the SQL Server instance to Full access enabled. Note that this is the same FILESTREAM access level we saw in the previous section when remote streaming access was not enabled. The fact is that there is no way to differentiate between the two settings at the instance level. We can verify these values by running the following T-SQL code after the SQL Server instance is installed.

    SELECT
        SERVERPROPERTY('FilestreamEffectiveLevel') AS EffectiveValue,
        SERVERPROPERTY('FilestreamConfiguredLevel') AS WindowsConfiguredValue

Listing A-1: Checking FILESTREAM access level.

If remote I/O streaming is enabled, the server property value will be 3. If I/O streaming is enabled without remote access, it will be 2. These configuration settings are explained in detail later in this appendix.

Enabling and Configuring FILESTREAM After Installation

If FILESTREAM is not enabled during the SQL Server installation process, it is quite possible to perform the configuration post-installation. The most common reasons for not enabling FILESTREAM during installation are:

• the feature is not (currently) required in the environment
• the DBA or server administrator does not know whether FILESTREAM is needed or not, so it is better to keep it disabled and enable only when needed.

As discussed earlier, enabling and configuring FILESTREAM after installation requires two separate configuration actions, and we'll examine each in detail.

Configuring FILESTREAM at the Windows level

Windows level FILESTREAM feature configuration is administered from the SQL Server Configuration Manager. Follow the steps below to open the FILESTREAM configuration dialog.

1. Select SQL Server Configuration Manager from the Configuration Tools submenu.

Figure A-3: Starting SQL Server Configuration Manager.

2. From SQL Server Configuration Manager, right-click on the SQL Server Service and select Properties from the context menu.

Figure A-4: Opening SQL Server Service Properties.

3. In the Properties dialog, open the FILESTREAM tab, which allows you to enable the FILESTREAM feature and select the desired access level.

Figure A-5: Enabling the FILESTREAM feature.

4. The options available on this page are the same as those available in the installation dialog. See the section above, where we discussed these options in detail.

Configuring FILESTREAM at the SQL Server instance level

FILESTREAM can be configured at the SQL Server instance level, either using the SSMS GUI dialog or through T-SQL.

Using SSMS to set FILESTREAM access level

SQL Server instance-level FILESTREAM configuration settings can be managed from the property page of the SQL Server instance.

1. In the Object Explorer window, right-click on the SQL Server instance and select Properties from the context menu.

Figure A-6: Opening the SQL Server Instance Property Page.

2. Click on the Advanced tab on the left-hand-side control panel.
3. Look for the Filestream section on the top right-hand side of the property page.

Figure A-7: Configuring the Filestream Access Level.

The Filestream Access Level option within the configuration dialog allows you to select one of the following options:

• Disabled – disables FILESTREAM access for the SQL Server instance
• Transact-SQL access enabled – enables FILESTREAM for T-SQL access
• Full access enabled – enables FILESTREAM for T-SQL and file I/O streaming access.

Making changes to the SQL Server FILESTREAM access level will require a service restart in some cases. Whenever you make changes to the FILESTREAM access level, SSMS will show a warning message indicating that the changes will not take effect until SQL Server is restarted (Figure A-8).

Figure A-8: Restart SQL Server message.

However, we can safely ignore this message when we select an "enable" access option; a service restart is not required when enabling FILESTREAM access on a SQL Server instance. If we completely disable access by selecting Disabled, a restart is required before the change will be effective.

Figure A-9: Configured values and Running values.

Notice that the property page has two views: Configured values and Running values, as shown in Figure A-9. The Running values view shows the values currently being used by SQL Server. If an access level change requires a SQL Server restart, as in the case of Disabled, you will see the new value in Configured values and the old value in Running values until the instance is restarted.

Using T-SQL to set the FILESTREAM access level

We can use T-SQL to set the FILESTREAM access level. For example, run the T-SQL code in Listing A-2 to change the access level to Full Access Enabled.

    EXEC sp_configure filestream_access_level, 2
    GO
    RECONFIGURE
    GO

Listing A-2: Using T-SQL to change the FILESTREAM access level.

The second argument passed to sp_configure specifies the FILESTREAM access level.

• 0 - Disables FILESTREAM access.
• 1 - Enables FILESTREAM for T-SQL access.
• 2 - Enables FILESTREAM for full access (T-SQL and local I/O streaming). This option is not valid on clustered installations.

When you change the FILESTREAM access level using T-SQL, a message to restart SQL Server service is displayed only if you choose the "disable" FILESTREAM access level (parameter value 0).
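
The Configured values and Running values views in SSMS have a T-SQL counterpart. As a minimal sketch, the query below reads both values for this option from the sys.configurations catalog view, where value holds the configured setting and value_in_use holds the one currently in effect. The column aliases are chosen here for illustration.

```sql
-- Compare the configured value with the value currently in use.
-- The two differ while a change is still waiting to take effect.
SELECT name,
       value        AS ConfiguredValue,
       value_in_use AS RunningValue,
       is_dynamic   -- 1 = the option takes effect when RECONFIGURE is run
FROM sys.configurations
WHERE name = 'filestream access level';
```

If ConfiguredValue and RunningValue differ after running RECONFIGURE, the change is most likely waiting for a service restart.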

Verifying FILESTREAM Configuration

We have seen various FILESTREAM configuration options and learned how to enable and configure FILESTREAM to the required access level. Let's verify that all the FILESTREAM settings are correctly configured before we start using them.

Verifying Windows-level FILESTREAM configuration settings

We know that the Windows-level FILESTREAM configuration is done from SQL Server Configuration Manager. So the first step is to check the FILESTREAM tab of the property page of the SQL Server service within the Configuration Manager.

Make sure that the FILESTREAM access is enabled to the desired level on the configuration page. If I/O streaming access to the FILESTREAM data is required, make sure that the FILESTREAM share name is configured in the configuration dialog.

Figure A-10: Verifying FILESTREAM configuration.

It is also important to verify that a Windows share with the same name exists, to ensure that the Windows share is correctly configured.

Figure A-11: Verifying FILESTREAM Windows share.

Please note that the Windows share does not point to a physical disk location; it points to a virtual identifier which allows SQL Server to map a SQL Server instance to its associated FILESTREAM data container. Attempting to open the Windows share will result in an error.

Figure A-12: The FILESTREAM Windows share is not accessible using Windows Explorer.

Verifying the Windows-level configuration can also be done using T-SQL, as discussed next.
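
As a quick check from the SQL Server side, the share name can also be read with SERVERPROPERTY. This is a minimal sketch; the result is meant to be compared by eye with the share name shown in SQL Server Configuration Manager.

```sql
-- Returns the name of the Windows share that FILESTREAM uses
-- for streaming access on this instance.
SELECT SERVERPROPERTY('FilestreamShareName') AS FilestreamShareName;
```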

Verifying SQL Server instance-level FILESTREAM configuration settings

To ensure that the SQL Server instance is enabled and correctly set up for FILESTREAM storage, we can use SSMS or T-SQL.

Using SSMS to check that FILESTREAM is enabled

You can verify that FILESTREAM is correctly configured by looking at the SQL Server instance properties in SSMS. Select the Advanced page, then select the Running values view (Figure A-13).

Figure A-13: The Running values view shows the active configuration settings.

Then look at the Filestream Access Level as shown in Figure A-14.

Figure A-14: Checking the Filestream Access Level.

The access level should show either Transact-SQL access enabled or Full access enabled. Note that, in the Running values view, there is no actual drop-down list.

The Configured values view will show the Filestream Access Level as configured at the SQL Server instance level. It is possible for the configured value to show Full access enabled while the FILESTREAM feature is actually disabled at the Windows level.

Using T-SQL to check that FILESTREAM is enabled

The T-SQL code in Listing A-3 retrieves the FILESTREAM configuration of the current SQL Server instance.

    SELECT SERVERPROPERTY('FilestreamEffectiveLevel') AS FilestreamEffectiveLevel
    /*
    FilestreamEffectiveLevel
    ---------------------------
    1
    */

Listing A-3: Query to retrieve the FILESTREAM configuration level of a SQL Server instance.

The possible return values of the above query are:

• 0 - FILESTREAM is disabled
• 1 - FILESTREAM is enabled for T-SQL access
• 2 - FILESTREAM is enabled for T-SQL and local streaming access only
• 3 - FILESTREAM is enabled for T-SQL and local and remote streaming access.

At the time of writing, neither the MSDN documentation nor Books Online mentions configuration value 3. Configuration values 2 and 3 are used to differentiate between local and remote streaming access.
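
The numeric levels are easier to read with a description attached. The following sketch decodes FilestreamEffectiveLevel using the values listed above and shows the configured level next to it. The variable names, column aliases and status wording are assumptions made for illustration, and the status column only flags the case where the effective level is lower than the configured one, which is what happens when the feature is restricted or disabled at the Windows level.

```sql
-- Read the configured and the effective FILESTREAM levels.
DECLARE @configured int = CAST(SERVERPROPERTY('FilestreamConfiguredLevel') AS int);
DECLARE @effective  int = CAST(SERVERPROPERTY('FilestreamEffectiveLevel')  AS int);

SELECT
    @configured AS ConfiguredLevel,
    @effective  AS EffectiveLevel,
    -- Descriptions follow the list of return values given above.
    CASE @effective
        WHEN 0 THEN 'Disabled'
        WHEN 1 THEN 'T-SQL access'
        WHEN 2 THEN 'T-SQL and local streaming access'
        WHEN 3 THEN 'T-SQL, local and remote streaming access'
        ELSE 'Unknown'
    END AS EffectiveDescription,
    -- A lower effective level suggests a Windows-level restriction
    -- or a change that has not yet taken effect.
    CASE
        WHEN @effective < @configured THEN 'Effective level is lower than configured'
        ELSE 'OK'
    END AS Status;
```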

Disabling the FILESTREAM Feature

To disable the FILESTREAM feature at the Windows level, clear the Enable FILESTREAM for Transact-SQL access option in the Properties page of the SQL Server instance in SQL Server Configuration Manager.

Figure A-15: Disabling the FILESTREAM feature.

A message will be displayed to inform you that a service restart is required before the changes can take effect, as shown in Figure A-16.

Figure A-16: Restart SQL Server instance message following disable.

Until the instance is restarted, the FILESTREAM-enabled databases can be queried using T-SQL (both relational and FILESTREAM data), but no longer using the streaming APIs. Access using the streaming APIs is disabled immediately because the file share is removed.

The restart is required because SQL Server has to take offline any FILESTREAM-enabled databases on the instance. Following the service restart, any FILESTREAM-enabled databases will not be accessible. You will be able to see the databases in the SSMS Object Explorer (Figure A-17), but you will not be able to establish a connection to them.

Figure A-17: Inaccessible FILESTREAM-enabled database in Object Explorer.

If we now try to run a query on a FILESTREAM-enabled database, we will get an error similar to the one shown in Listing A-4.

    SELECT *
    FROM NorthPole.dbo.items
    /*
    Msg 945, Level 14, State 2, Line 1
    Database 'NorthPole' cannot be opened due to inaccessible
    files or insufficient memory or disk space.
    See the SQL Server errorlog for details.
    */

Listing A-4: Error message showing FILESTREAM-enabled database is inaccessible.
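
Before disabling the feature, it can help to know which databases would be affected. As a sketch, the query below lists every database on the instance that has a FILESTREAM data container, based on the type_desc column of sys.master_files. The column aliases are chosen for illustration.

```sql
-- List the databases on this instance that have a FILESTREAM data container.
-- In sys.master_files, type_desc = 'FILESTREAM' identifies such containers.
SELECT DB_NAME(database_id) AS DatabaseName,
       name                 AS LogicalName,
       physical_name        AS ContainerPath
FROM sys.master_files
WHERE type_desc = 'FILESTREAM'
ORDER BY DatabaseName;
```

An empty result suggests that no database on the instance currently depends on FILESTREAM.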

Advanced Installation and Configuration Options

We have examined the basic and the most commonly used FILESTREAM installation and configuration processes, and it is time for us to see some of the advanced configuration options.

Unattended SQL Server installation and FILESTREAM configuration

When installing SQL Server 2008 or later in unattended mode, we can use the /FILESTREAMLEVEL and /FILESTREAMSHARENAME configuration options to enable the FILESTREAM feature.

• /FILESTREAMLEVEL – This setting is optional. It specifies the required access level for the FILESTREAM feature. The following values are supported:
   • 0 – Disable FILESTREAM for the current instance. This is the default value.
   • 1 – Enable FILESTREAM for T-SQL access.
   • 2 – Enable FILESTREAM for T-SQL and I/O streaming access. This option is not valid for clusters.
   • 3 – Allow remote clients streaming access to FILESTREAM data.
• /FILESTREAMSHARENAME – This setting is required if /FILESTREAMLEVEL is 2 or 3. It specifies the Windows file share name to be used for storing the FILESTREAM data.

The example in Listing A-5 installs a new instance of SQL Server named KATMAIUA with FILESTREAM enabled, and sets the FILESTREAM share name to KATMAIUA.

    Setup.exe
        /q
        /ACTION=Install
        /FEATURES=SQL
        /INSTANCENAME=KATMAIUA
        /SQLSVCACCOUNT="DOMAIN\User"
        /SQLSVCPASSWORD="mypwd"
        /SQLSYSADMINACCOUNTS="DOMAIN\User"
        /AGTSVCACCOUNT="DOMAIN\User"
        /AGTSVCPASSWORD="mypwd"
        /IAcceptSQLServerLicenseTerms
        /FILESTREAMLEVEL="3"
        /FILESTREAMSHARENAME="KATMAIUA"

Listing A-5: Example setup options for enabling FILESTREAM in unattended mode.

/IAcceptSQLServerLicenseTerms is a new parameter introduced in SQL Server 2008 R2. It is a mandatory parameter for unattended installations and the setup will fail if this value is not supplied.

Figure A-18 shows an example of the command line to enable FILESTREAM in unattended mode.

Figure A-18: Example of enabling FILESTREAM in unattended mode.

Once the installation script completes, we can verify the configuration settings of the newly installed SQL Server instance using the methods explained earlier in this appendix.

Enabling and configuring FILESTREAM from the command line

Sometimes, we need to enable and configure FILESTREAM using the command line, for example as part of the installation or upgrade of an application that intends to use the FILESTREAM features. It is quite easy to set the FILESTREAM access level by using SQL Server command line tools like sqlcmd.exe.

However, programmatically enabling the FILESTREAM feature itself is more complicated, and calls for a scripting language such as VBScript. Luckily, a short VBScript is available for download at codeplex.com: http://brurl.com/fs26.

Listing A-6 shows the syntax for running that script.

    cscript filestream_enable.vbs
        [/Machine:MachineName]
        [/Instance:InstanceName]
        [/Level:level]
        [/Share:ShareName]

Listing A-6: Syntax for running VBScript program to enable FILESTREAM.

Some debug messages are displayed; we can safely ignore these. The following is an example command line for enabling FILESTREAM programmatically.

    cscript filestream_enable.vbs /Machine:localhost /Instance:MSSQLSERVER /Level:3 /Share:MSSQLSERVER

Once the script execution has completed, FILESTREAM is enabled at the Windows level, and we can then configure the access level on the SQL Server instance.

As of this writing, the CodePlex script mentioned does not work with SQL Server 2012. If you would like to run this script on SQL Server 2012, you need to edit the script and change ComputerManagement10:FilestreamSettings to ComputerManagement11:FilestreamSettings on lines 37 and 110.

Configuring FILESTREAM access level from the command line

We can also configure the FILESTREAM access level from the command line using sqlcmd.exe. It allows us to run a T-SQL script on the specified SQL Server instance. As a first step, create a T-SQL script file containing the following code.

    EXEC sp_configure filestream_access_level, 2
    GO
    RECONFIGURE
    GO

Listing A-7: Creating a script file to configure FILESTREAM access level from the command line.

In this example, we are setting the FILESTREAM access level to Full access. See the reference earlier in this appendix for a detailed explanation of the values that can be passed as the second parameter to the system stored procedure sp_configure.

Save the script file as c:\temp\set_filestream_access_level.sql. This file can be passed as the query input to sqlcmd.exe using the -i option (lowercase; the uppercase -I option has a different meaning). The following example demonstrates it.

    sqlcmd -S . -E -i "c:\temp\set_filestream_access_level.sql"

When running this command, the T-SQL code we wrote within the script file will be executed against the specified SQL Server instance (in our case it is the default SQL Server instance on the local computer) and the output will be displayed in the command window.
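
When a script like this runs as part of an application installer, it may be executed more than once. As a sketch under that assumption, the variant below calls sp_configure only when the configured value differs from the desired one. The variable name @desired is hypothetical, and the check relies on the sys.configurations catalog view.

```sql
-- Desired FILESTREAM access level (2 = full access).
DECLARE @desired int = 2;

-- Change the setting only if the configured value is different.
IF EXISTS (SELECT 1
           FROM sys.configurations
           WHERE name = 'filestream access level'
             AND CAST(value AS int) <> @desired)
BEGIN
    EXEC sp_configure 'filestream access level', @desired;
    RECONFIGURE;
END
```

This variant can be saved to the script file and passed to sqlcmd.exe in the same way as the original script.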

Figure A-19: Configuring FILESTREAM access level from command line.

We can use the various methods discussed earlier to verify that the configuration changes took place correctly on the specified SQL Server instance.

Setting up FILESTREAM on a failover cluster

When setting up the FILESTREAM feature on a failover cluster, we need to make sure that all the nodes are set up with the same configuration. The following steps are recommended for enabling and configuring the FILESTREAM feature on a failover cluster.

1. Enable the FILESTREAM feature on the primary node using SQL Server Configuration Manager.
2. If I/O streaming access to the FILESTREAM data is required, enable it and specify the desired Windows share name or leave the default.
3. If remote access to the FILESTREAM data is required, enable the option. This action will also create a cluster file share resource.
4. Enable the FILESTREAM feature on each passive node using SQL Server Configuration Manager. The I/O streaming option on each of the passive nodes should be configured in the same way as on the primary node. All the passive nodes must use the same Windows share name as the primary node.

5. After FILESTREAM is enabled and configured using SQL Server Configuration Manager, configure the desired FILESTREAM access level on the primary node and all the passive nodes. It is important to make sure that you set the same FILESTREAM access level on all the nodes.

It is also important to configure the FILESTREAM data container(s) on a shared disk. The FILESTREAM data container must be accessible from all nodes in the cluster, hence the requirement to place it on a shared disk.

Summary

FILESTREAM can be enabled during installation of the SQL Server instance by selecting the appropriate options in the wizard. However, there may be times when you need to enable it on an existing SQL Server instance.

Enabling FILESTREAM post-installation is a two-step process. FILESTREAM must be enabled at the Windows Administrator level using SQL Server Configuration Manager, and it must be enabled at the SQL Server Administrator level using either SSMS or the sp_configure system stored procedure.

FILESTREAM can also be enabled as part of an unattended SQL Server installation by using the /FILESTREAMLEVEL and /FILESTREAMSHARENAME setup parameters.

Additionally, a VBScript program, available at codeplex.com, can be used to programmatically enable the FILESTREAM feature at the Windows Administrator level.

If we disable FILESTREAM on a SQL Server instance that has one or more FILESTREAM-enabled databases, those databases will not be accessible after restarting the SQL Server service. The data in the database is unaffected, and will be available as normal if FILESTREAM is re-enabled and the SQL Server service is restarted. The message that SQL Server Configuration Manager provides is quite misleading. It says that, after disabling FILESTREAM, data in the FILESTREAM columns will not be available. However, the entire database will be inaccessible, not just the FILESTREAM columns.

