Understanding Image Load Files

It’s been years, I know, but not much has changed in the world of image load files.  So, on to the fun stuff!  Where metadata load files can have different delimiters, image load files can have all sorts of different formats.  Let’s take a look at two different and pretty common types of image load files:  OPT and LFP.

OPT (Opticon) Files

Opticon files are used to import images into review applications like Relativity, the Concordance Image Viewer (fka Opticon), and others.  Here’s an example of an OPT file:

[Image: OPT file overview]

What does all of this mean?  What can you tell about the delivery you’ve received by these rows of text?  Let’s analyze the first row:

  • PRODABC00000001, – This is the image key.  This is the unique identifier for (in this OPT) the image.  This will be the information that the review application uses to match the image(s) up with the reference to the document.  Usually from this, you can tell what the Bates or Control numbers are for the documents you’ll load to the database.
  • PRODABC001, – This is the volume name associated with this particular set of documents.
    • If you have several loads that would be easier to import at one time, you can combine the various OPT files into one file–there are many free text editors that can help you accomplish this task.
    • You do not necessarily need a volume name, but it can help you identify the load file from which the images originally came, should you ever need to reference that information again.
  • \ABC001\IMAGES\IMAGES001\PRODABC00000001.TIF,
    • This is the path to the location of the image file to be loaded.
    • It is important that the path to the file is correct, so that the review application can find the file.
    • In this example, there is an entry for each page (image) that was provided.
    • From this, you can tell what type of file will be imported (TIF, JPG, etc.).
    • Something interesting to note here is that the name of the file does not matter.  The unique identifier (image key) mentioned above should match with the unique identifier in the review application.  As long as this path leads the review application to the correct file, the file name could be “banana.tif.”
  • Y, – This tells the review application and you that this is the first page of the document.
    • When this isn’t populated, you are most likely looking at a line that indicates a page within the document.
    • If you don’t see any instances of this being blank, you are either seeing a single page document or the load file you’ve received points to multi-page TIF images.
  • , (the blank section after the “Y”) – This is populated to indicate a folder break.  I have never seen this populated, but it could happen.
  • , (the blank section after the last blank section) – This is populated to indicate a box break.  Again, I have never seen this populated.
  • 2 – This indicates the number of pages within the document. If there is no number listed, you are likely looking at a page within the document.
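If you would rather let a script do the reading, the breakdown above can be sketched in a few lines of Python.  This is only a rough illustration, not an official parser: the two sample rows are reconstructed from the bullets above, the function and field names are made up for this example, and it assumes that no image path contains a comma.  The fifth and sixth fields are read but ignored, since they are usually blank.

```python
from collections import namedtuple

# Field names are illustrative; only five of the seven fields are kept.
OptRecord = namedtuple("OptRecord", "image_key volume path doc_break page_count")


def parse_opt_line(line):
    """Parse one OPT row into an OptRecord (assumes no commas inside the path)."""
    fields = line.rstrip("\r\n").split(",")
    if len(fields) != 7:
        raise ValueError(f"expected 7 comma-separated fields, got {len(fields)}")
    key, volume, path, doc_break, _field5, _field6, pages = fields
    return OptRecord(
        image_key=key.strip(),
        volume=volume.strip(),
        path=path.strip(),
        doc_break=doc_break.strip().upper() == "Y",  # "Y" marks a first page
        page_count=int(pages) if pages.strip() else None,
    )


def summarize_opt(lines):
    """Count documents and images, and list the image file types referenced."""
    records = [parse_opt_line(line) for line in lines if line.strip()]
    file_types = sorted(
        {r.path.rsplit(".", 1)[-1].upper() for r in records if "." in r.path}
    )
    return {
        "documents": sum(1 for r in records if r.doc_break),
        "images": len(records),
        "file_types": file_types,
    }


# Sample rows reconstructed from the bullets above (hypothetical second page).
SAMPLE_OPT = [
    r"PRODABC00000001,PRODABC001,\ABC001\IMAGES\IMAGES001\PRODABC00000001.TIF,Y,,,2",
    r"PRODABC00000002,PRODABC001,\ABC001\IMAGES\IMAGES001\PRODABC00000002.TIF,,,,",
]

if __name__ == "__main__":
    print(summarize_opt(SAMPLE_OPT))
```

Comparing the "images" count against the number of image files actually delivered is a quick sanity check before you load anything.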

From the information in the OPT file, you can tell if you received single- or multi-page images, how many images you received, how they should be named, and how they will be referenced by the review application.  You can also tell how many images will be loaded, based on the number of rows included in the OPT file.  Not too bad for getting an understanding of what’s been provided!

LFP Files

This format is brought to us by IPRO and the official information about it can be found here.

Here’s an example of an LFP:

[Image: LFP file overview]

Break it down!

  • IM, – This tells the review application to prepare for an incoming image.  As of a 2004 publication (which apparently no longer exists), there were 23 other possibilities of information that could appear here.
  • PRODABC00000001, – This is the image key.  This is the unique identifier for (in this LFP) the image.  Notice that there is an entry for each page.  So, by looking at this file, you can tell that you received single-page TIF images.
  • D, – This tells the review application that this is the first page of the document.  There are different indicators that can fall in this spot:
    • D – Document – Indicates that this is the beginning of the document.  This is the most commonly used indicator.
    • S – Source – I have never seen this used, but I’m sure users of IPRO incorporate this to indicate the beginning of the source of the documents.
    • B – Box – I have rarely seen this used, but it can be used to indicate that the document is the start of a box of documents.
    • F – Folder – I have rarely seen this used, but it can be used to indicate that the document is the start of a folder of documents.
    • C – Child, or the attachment to the document immediately listed before this entry.  This is the second most commonly used indicator.  There are applications that will use this information to indicate an attachment.
    • <Blank> – This indicates a page within the document.  Sometimes there is a space here, sometimes not.  A space within this location will not affect importing the pages of a document.
  • 0, – This isn’t explained in the referenced document.  If you know, please comment!
  • @PRODABC001; – The volume name for the set of images being loaded.  Just like with the OPT file, this doesn’t necessarily matter, but can be helpful for general knowledge and organization.  You can also combine LFPs into one large LFP to load images and having the volume information will help you remember where each set of images originated.
  • .\ABC001\Images\Images001; – The path to the folder where the images exist.  This is considered a relative path, since there is no drive letter or network location indicated at the beginning of the path.  Instead there is a period.  So, your load file should exist outside the first folder listed in this path.
  • PRODABC000000001.TIF; – The name of the image as it exists within the folder path.  Just like with the OPT file, the name of the image does not have to match the unique identifier, as long as the path and name of the file are on the same line as the unique identifier.
  • 2 – This number indicates the format of the image being loaded (the referenced document does not say that, but this is my understanding).

From the information in the LFP, you can tell if you received single- or multi-page images, how many images you received, how they should be named, and how they will be referenced by the review application.  Getting all of this information from the file itself helps you QC the images after they have been loaded to the review tool you are using.
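For those who like to script their QC, here is a minimal Python sketch of the same breakdown.  It assumes each image line looks exactly like the pieces described above (comma-separated values up to the "@", then semicolon-separated values after it); real LFPs can carry other line types and extra values, which this sketch does not handle.  The function names, dictionary keys, and sample rows are made up for illustration, and the two fields I could not explain are kept as raw text.

```python
def parse_lfp_image_line(line):
    """Parse one IM line, assuming the layout described in this post."""
    line = line.rstrip("\r\n")
    head, sep, tail = line.partition(",@")
    if not sep:
        raise ValueError("no ',@' volume marker found")
    head_fields = head.split(",")
    tail_fields = tail.split(";")
    if len(head_fields) != 4 or head_fields[0].strip() != "IM":
        raise ValueError("not an IM (image) line in the assumed layout")
    if len(tail_fields) != 4:
        raise ValueError("expected volume;folder;file name;format after the '@'")
    _, key, flag, fourth = head_fields
    volume, folder, filename, last = tail_fields
    return {
        "image_key": key.strip(),
        "flag": flag.strip().upper(),   # D, C, S, B, F, or "" for a page
        "fourth_field": fourth.strip(),  # the unexplained "0"
        "volume": volume.strip(),
        "folder": folder.strip(),
        "filename": filename.strip(),
        "last_field": last.strip(),      # the trailing "2"
    }


def group_into_documents(records):
    """Start a new document at every non-blank flag; blank flags are pages."""
    documents = []
    for record in records:
        if record["flag"] or not documents:
            documents.append([])
        documents[-1].append(record["image_key"])
    return documents


# Hypothetical rows built from the pieces described above.
SAMPLE_LFP = [
    r"IM,PRODABC00000001,D,0,@PRODABC001;.\ABC001\Images\Images001;PRODABC00000001.TIF;2",
    r"IM,PRODABC00000002, ,0,@PRODABC001;.\ABC001\Images\Images001;PRODABC00000002.TIF;2",
    r"IM,PRODABC00000003,C,0,@PRODABC001;.\ABC001\Images\Images001;PRODABC00000003.TIF;2",
]

if __name__ == "__main__":
    parsed = [parse_lfp_image_line(row) for row in SAMPLE_LFP]
    print(group_into_documents(parsed))
```

Treating every non-blank flag as the start of a new document is a simplification; if your review tool distinguishes children from parents, you would want to keep the "C" flag around rather than flatten it.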

There are absolutely other types of image load files that you may receive as part of a delivery, so be sure that you use your review tool’s documentation to understand what format the image load file needs to have in order to load those images.

I hope this was helpful in your quest to understand those files you received, but please comment with any questions you have or if anything I said was unclear.

Happy Loading!

~CAtkins Support

Posted in eDiscovery, Litigation Support, Load Files

No, You Don’t Have to Read Everything

Source: No, You Don’t Have to Read Everything

Posted in Uncategorized

eDiscovery Education

So, the other day I was reading an article about eDiscovery.  I do that a lot…it’s weirdly interesting to me.  While I was reading, I saw a line about the embarrassing level of technical knowledge attorneys have.  This isn’t the first time I’ve seen a comment like this.  Each time I see it (or some iteration of it), I cringe.  Why aren’t attorneys complaining about their tech team’s lack of knowledge about the law?  Wouldn’t it make sense for the Litigation Support service provider, software company, or department to have a detailed understanding of the law?

I remember a conversation I was having with an attorney about what needed to be done to get some data ready for production.  They flat out told me that they didn’t care and didn’t want to know–it just needed to be done.  A couple of weeks after that conversation, I got a call from that same attorney telling me that they were required to explain to the judge how they got the data to the point of production.  It was at that moment that the attorney realized two things:  that they needed at least a high-level knowledge of how we got to production, and that they needed someone who could explain it in detail for them.

My career started with me knowing absolutely nothing about the law, but having a good understanding of technology.  Through the years, I have had to learn about pleadings, responses, issues, production orders, etc.  These pieces of information were definitely helpful in doing my job, but by no means were they required to get the job done.

Of course it’s a good idea for your Lit Support folks to know–at least at a high level–something about the way litigation works.  What are the steps along the way?  Why does a case team need the data that they request, and how will it be used for productions and during trial (if it gets that far)?  It’s also a good idea for attorneys to have a high-level understanding of the steps and time involved in gathering, processing, producing, and presenting data.  Is it really embarrassing that attorneys don’t have an in-depth knowledge of technology?  No, it’s not.

However, if you are interested in learning about eDiscovery, I recommend taking a look at the EDRM Diagram first.  The eDiscovery community contributes to this framework, which puts you directly inside the industry.  It’s a great start, interactive, and contains glossaries to help you learn all of the fun terminology that goes along with this discipline.  There is a ton of information on this site!  And, the more you are involved in eDiscovery, the more you will see this diagram.

Once you have a good understanding of the terminology and phases of electronic discovery, read blogs and white papers.  There are more resources than I can count, but two of my personal favorites are Ball in your Court (Craig Ball) and Bow Tie Law (Josh Gilliland).  These guys are both attorneys that present relevant information in interesting ways.

There’s also the Litigation Support Guru (Amy Bowser-Rollins) who offers classes, a blog, tips, and tests on her site.  She’s been in the industry for quite a while and offers a great approach to understanding litigation support terminology, careers, and so much more.

If none of these are to your liking, look for white papers on your (or any) service provider’s web site.  You can also do a web search for “eDiscovery blogs” or “ESI blogs” or maybe “eDiscovery training” and you’re sure to find something that leads you down the best path for you.

Most importantly, talk with people in the industry.  Ask questions.  Sit with the people in your Litigation Support department and watch the things they do to get your data prepared for review and production.  Ask more questions.  Work closely with your service providers.  Keep asking questions.  Try some of the steps yourself.  Yep, ask questions again.  Doing this will help you gain a better understanding of why processes take the time that they do and will give you a good idea of the steps that are involved in each phase of the eDiscovery process.

Once you have worked your way through the suggestions above, you should have a good understanding of eDiscovery and if you’re really interested in the world of eDiscovery, you will take it from there.  And, when the judge asks for an explanation of how you got the data to the point of production, you might just be excited to answer!

Happy Learning!

~CAtkins

Posted in eDiscovery, Education, Litigation Support, Technology

A complete list of file types that cannot/should not be processed?

If you are a part of the Litigation Support community, you have no doubt been asked for a complete list of file types that cannot or should not be processed.  What was your answer?

My answer has always been:  There is no definitive list of file types that fall within that category.

Here’s why and some tools to help you figure out what you don’t need…

NIST

Some may argue that the NIST’s (National Institute of Standards and Technology) National Software Reference Library (NSRL) is just such a list.  This list contains, “…[information that] will help alleviate much of the effort involved in determining which files are important as evidence on computers or file systems that have been seized as part of criminal investigations. The [Reference Data Set] RDS is a collection of digital signatures of known, traceable software applications.”  This is a great start for getting rid of file types that are unnecessary and known by the NIST, but it is by no means all-encompassing.
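To make the idea concrete, here is a rough Python sketch of how a list of known digital signatures can be used to set files aside.  It assumes you have already pulled a set of SHA-1 hash values (as hexadecimal text) out of whatever reference list you are using; how you load that list is not shown, and the function names are made up for this example.  It is not how any particular processing tool does it.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the SHA-1 of a file as uppercase hex, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def split_known_files(paths, known_hashes):
    """Separate files whose hash is in the known list from everything else.

    `known_hashes` is an assumed, pre-loaded collection of hex SHA-1 strings.
    """
    known = {value.upper() for value in known_hashes}
    matched, unmatched = [], []
    for path in paths:
        target = matched if sha1_of_file(path) in known else unmatched
        target.append(Path(path))
    return matched, unmatched
```

Anything in the "unmatched" list still needs a decision from you, which is exactly why a hash list alone cannot be the complete answer.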

So, what can be done?

Before we get into details, let’s first define “processing.”  Some think of processing as ingesting data and extracting metadata and text from that data.  Others see processing as the process of converting a native file into an image (usually a TIFF or a JPG).  Many define processing as a combination of these things, along with performing OCR on data that did not have text to extract.  The EDRM lists processing as encompassing the following:

  • Aims: Perform actions on [Electronically Stored Information] ESI to allow for metadata preservation, itemization, normalization of format, and data reduction via selection for review.
  • Goal: Identify ESI items appropriate for review and production as per project requirements.

A quick word about metadata:

Most documents have at least a bit of metadata (data about the data) that can be extracted, many have “extractable” text, and most can be included within a review database.  This, however, does not mean that you will be able to review the document.  In most instances, if a document cannot be readily opened by an installed application, it’s likely that the same document cannot be converted to an image format.

If you’re having issues working with the file, your opposing counsel will likely encounter issues too and we all know what that means!

Now, back to our originally scheduled discussion…

Case Type

A huge piece of the answer lies within your specific data set and the focus of your case.  Is the focus of your case on intellectual property?  If so, you might need to keep all of those program files around for review and, possibly, production.  Or, is yours an employment case, all about retaliation?  In that instance, you’ll want to push the program/system files aside and focus on the email messages and general documentation.  Ask yourself what type of information needs to be produced to the other side and then make decisions about what files to keep for review.

File Location

One place to start identifying the files you’ll want to keep is in the original file path.  If you have collected the data (in a forensically sound manner) as it was originally kept and had that data inventoried, the original file path information should be readily available and identifiable.  An inventory of your data can be generated by a service provider or after you have made a copy of the original data (again, in a forensically sound manner), using any of a variety of applications that serve the purpose of generating an inventory of files.

Here’s an example:  If your case is an employment matter, files that are in the “…\Program Files\” or the “…\Windows\” folders can usually be put aside.  These are all of the files that your computer uses to run programs and to remember what image you have chosen to use as the desktop wallpaper.  It is worth taking a look at the content of these types of folders, just to make sure nothing has been hidden there.  However, most people use folders to stay organized and most people don’t make a habit of navigating to program or system folders to store their every-day files.
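If your inventory includes the original file path for each file, a first pass at this can be scripted.  The sketch below is only an illustration: the folder names in the list are my assumption of what you might set aside (“Program Files (x86)” is not mentioned above), the function names are made up, and it flags a path if any folder along the way has one of those names, so a user folder that happens to be called “Windows” would be flagged too.

```python
from pathlib import PureWindowsPath

# Assumed list of folder names to set aside; adjust for your own case.
SYSTEM_FOLDERS = {"program files", "program files (x86)", "windows"}


def is_system_path(original_path):
    """True if any folder in the path (not the file name) is in the list."""
    parts = PureWindowsPath(original_path).parts
    return any(part.lower() in SYSTEM_FOLDERS for part in parts[:-1])


def split_by_location(original_paths):
    """Return (set_aside, keep) lists based on the original file path."""
    set_aside, keep = [], []
    for path in original_paths:
        (set_aside if is_system_path(path) else keep).append(path)
    return set_aside, keep


if __name__ == "__main__":
    inventory = [
        r"C:\Program Files\SomeApp\app.dll",
        r"C:\Windows\System32\notepad.exe",
        r"C:\Users\jdoe\Documents\Invoices\2014\01January\invoice.xlsx",
    ]
    aside, keep = split_by_location(inventory)
    print(len(aside), "set aside;", len(keep), "kept")
```

The “set aside” list is a list to glance through, not a list to delete, since something may have been hidden there.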

If you take a look in the “My Documents” or “Documents” folder on a user’s machine, you’re more likely to find a better crop of documents.  You may see a folder structure that looks something like this:

  • Documents
    • Invoices
      • 2014
        • 01January
        • 02February
        • 03March
      • 2015
        • 01January
        • 02February
        • 03March
    • Presentations
    • Sales Information
    • Staff Information
      • Employee 001
      • Employee 002
      • Employee 003
    • Etc.

In this structure, you’ll likely want to take a pretty close look at the files in the “Staff Information” folder and its sub-folders.  But, don’t ignore the other folders.  Maybe someone incorrectly filed a document or maybe someone intentionally saved a file in an unrelated folder to throw others off of the trail.

File Types

Another quick way to determine what to keep and what to ignore is to grab a list of file extensions.  These are the three or four (sometimes fewer, sometimes more, sometimes none at all) characters that come after the period in file names.  In the file name “JanuaryInvoices.xlsx,” the “XLSX” is the file extension.  Your computer might be set to hide the extensions of more common file types.  This setting can be changed in Windows.

WARNING:  One file type that will most likely end up on your “do not process” list is the EXE.  This is the common extension of the files that prompt a program to start/run.  It’s best to avoid these guys, if at all possible, unless your case specifically calls for you to produce them.  If that happens, you’ll likely need to review an entire program and not the EXE alone.

TIP:  Those pesky “thumbs.db” files.  You’ve likely seen them and wondered what they are.  According to Microsoft, “[A] Thumbs.db stores graphics, movie, and some document files, then generates a preview of the folder contents using a thumbnail cache.”  These files are not useful to your review efforts and will usually return an error when trying to view as a native or convert to an image.

There are a variety of applications that will extract the file’s extension from the file’s name and place the extension in a field or column.  From there, you can group or tally the list of extensions and find which file types you want to keep.  Just don’t forget about keeping families together!  In other words, you might not want to remove a DLL file if it’s attached to an email message.  If you do, be prepared to explain why there is an attachment referenced by an email, but no attachment to be found in your production set.
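If you don’t have one of those applications handy, a tally like this takes only a few lines of Python.  This is a bare-bones sketch: the function name and the “(none)” label for files without an extension are my own choices, and it only looks at the file name, not at what the file actually contains.

```python
from collections import Counter
from pathlib import PureWindowsPath


def tally_extensions(file_names):
    """Count files per extension (lowercased); "(none)" means no extension."""
    counts = Counter()
    for name in file_names:
        extension = PureWindowsPath(name).suffix.lower()
        counts[extension or "(none)"] += 1
    return counts


if __name__ == "__main__":
    names = ["JanuaryInvoices.xlsx", "FebruaryInvoices.XLSX", "setup.exe", "README"]
    for extension, count in tally_extensions(names).most_common():
        print(extension, count)
```

Sorting by count puts the biggest groups at the top, which is usually where the quickest keep-or-ignore decisions are.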

If you’re not sure about the software that can be used to open a certain type of file, you can either search for the file extension itself using a search engine or use one of the many sites that contain file extensions and their commonly associated applications.  My go-to site is FILExt, but there are several others out there that contain great information.

Using a compiled list of file extensions can help you knock out hundreds, maybe thousands of unwanted files at once.  Remember to watch out for families!
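Here is one way the “watch out for families” rule could look in a script.  It is a sketch under some loud assumptions: each item in your inventory is a small record with a “name” and a “family_id” (both hypothetical field names), and a family is removed only when every member has an unwanted extension.  Your own rule may be different.

```python
from collections import defaultdict
from pathlib import PureWindowsPath


def cull_by_extension(items, excluded_extensions):
    """Remove whole families only when every member has an excluded extension.

    `items` is an assumed list of dicts with "name" and "family_id" keys.
    """
    excluded = {extension.lower() for extension in excluded_extensions}
    families = defaultdict(list)
    for item in items:
        families[item["family_id"]].append(item)

    kept, removed = [], []
    for members in families.values():
        all_excluded = all(
            PureWindowsPath(member["name"]).suffix.lower() in excluded
            for member in members
        )
        (removed if all_excluded else kept).extend(members)
    return kept, removed


if __name__ == "__main__":
    inventory = [
        {"name": "notice.msg", "family_id": 1},
        {"name": "helper.dll", "family_id": 1},  # attached to the email above
        {"name": "other.dll", "family_id": 2},   # loose file, no family
    ]
    kept, removed = cull_by_extension(inventory, {".dll", ".exe"})
    print([item["name"] for item in kept], [item["name"] for item in removed])
```

In this example the DLL attached to the email stays with its parent, while the loose DLL is removed.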

Content

In the end, file content is what will ultimately be reviewed and produced.  Once you’ve used the NIST list, case-specific information, locations, and extensions to cull down the data, you are left with content.  There are a variety of applications that will organize your data’s content to quickly get to the documents you really need to review.  However, content-based culling is an entirely different post for another day.

Happy Culling!

~CAtkins

Posted in Culling/Filtering, eDiscovery, Litigation Support, Technology

How a Cookie Changed My Perspective

About 10 years ago, I was a customer service representative for a company that built convey lines and silos for various manufacturing companies. One of our clients was a company that made cookies. Not just any cookies–cookies with a marshmallow on top and covered in dark chocolate. So delicious!

This company had an issue with their convey line and needed a part. It was a small part, but without it, their line was down and they wouldn’t be able to meet their deadline. I talked with the company’s manager, who said that if I could get that part to him by the next day, he would send me some of these wonderful cookies.

So, I pulled every string I had to get him the part. I didn’t do it because I wanted the cookies (not that I didn’t want them, but that wasn’t the reason I worked so hard to get him that part). I did it because it was my job to make sure the customers got what they needed–I was a customer service rep, after all–and the manager was kind, explained his needs, and helped me understand exactly what the consequences would be for his company and the employees if the part wasn’t received.

The part arrived in time and they were back in business. I forgot about the promise of cookies.  I was happy that the client got what they needed and there were other clients waiting for their own parts.

About a week later, I received a box. The box wasn’t small at all and there wasn’t just one box of cookies. It was a case of cookies! I immediately called him to thank him for making good on what he said. He told me that I was the one to be thanked and that I should be sure to wait a little while before I ate any of the cookies, because they were so fresh off of the line that the chocolate needed to bloom to make them taste better. He was right and those cookies were wonderful!

How did all of this change my perspective?  It helped me realize that…

  • My job wasn’t as insignificant as I once thought.  That one small part he needed made a big difference in the manager’s life, the lives of his employees, and the lives of all the people who love those cookies!  Every little thing we do reaches places we would never suspect.
  • His kindness reminded me that there are people in this world that care just as much about their job as I do…those that would go to any lengths to get things done, without getting angry, yelling, or threatening.
  • Simply knowing that the line was working again made me proud.  Those cookies were definitely a bonus.
  • We should remember that we are in this together and no matter how tough or impossible something seems, as long as we have each other, we can make it work.
  • My Granny’s wisdom was as true then as it is now:  If you really put your mind to it, anything is possible!

Since then, I have had many experiences that have reinforced my realizations, and I foresee many more.

~CAtkins

Posted in Business, Perspective, Success

Can You Sue for Invasion of Privacy if Someone Reposts an Instagram Photo?

Interesting case and good information to keep in mind!

~CAtkins

Bow Tie Law's Blog

If a basketball player posts a public photo to Instagram, and then another basketball player reposts the photo, can the first basketball player sue for Invasion of Privacy, Intentional Infliction of Emotional Distress, Defamation, and General Negligence?

The answer is yes, you can sue, but you will not survive a motion to dismiss. That is the lesson from Binion v. O’Neal, 2015 U.S. Dist. LEXIS 43456, 1 (E.D. Mich. Apr. 2, 2015).

US District Judge Avern Cohn started this opinion in the most logical place: Instagram’s terms of service. The Court quoted Instagram’s FAQ and privacy statement as follows:

Instagram is a social media website that describes itself as a “fun and quirky way to share your life with friends through a series of pictures.” (FAQ, Instagram.com, https://instagram.com/about/faq/ (last visited Mar. 5, 2015)) Every Instagram user is advised that “[a]ll photos are public by default which means they…


Posted in Uncategorized