Parsing File Names to Extract Shot Information

Problem

Many studios embed shot/project information within their file structure. Parsing out this information can seem daunting if you’re new to Python.

Solution

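Python has many built-in string and file name parsing utilities that make this problem straightforward once you are used to them. Let us assume that each entity we want to extract corresponds to one directory level in the path (e.g. X:\project_name\sequence_name\shot_name\image_sequence_####.png). In this case, we can call Python’s os.path.split() function repeatedly to walk back up the directory structure, one level at a time. Note that os.path follows the path conventions of the operating system the script runs on, so a backslash-separated path like this one is only split as shown when the script runs on Windows: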

import os

#Sample input
#raw string: backslashes are written once and are not treated as escapes
inFile = r"X:\project\sequence\shot\image_sequence_####.png"

(head, tail) = os.path.split( inFile )
#tail is now 'image_sequence_####.png' and
#head is now 'X:\\project\\sequence\\shot'

#do it again (on head this time) to go back another level
(head, tail) = os.path.split( head )
#head is now 'X:\\project\\sequence'
#tail is now 'shot' - our shot name!
shotName = tail

#keep walking backwards to get more info!
(head, tail) = os.path.split( head )
#head is now 'X:\\project'
#tail is now 'sequence' - our sequence name!
sequenceName = tail

#one more...
(head, tail) = os.path.split( head )
#head is now just the drive root, 'X:\\'
#tail is now 'project' - our project name!
projectName = tail
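
The same walk can also be done in one step. The following is a minimal sketch, not part of the original recipe, that uses the standard library's pathlib.PureWindowsPath to break the path into its components. The helper name parse_shot_path is hypothetical, and the code assumes the project\sequence\shot\file layout used above. Because PureWindowsPath always applies Windows path rules, it behaves the same on any operating system.

```python
from pathlib import PureWindowsPath


def parse_shot_path(path):
    """Return the project, sequence, and shot names found in a Windows-style path.

    Hypothetical helper: assumes the three directories directly above the
    file are, in order, the project, the sequence, and the shot.
    """
    parsed = PureWindowsPath(path)
    # Directory components above the file, without the drive root (e.g. 'X:\\')
    directories = [part for part in parsed.parent.parts if part != parsed.anchor]
    if len(directories) < 3:
        # Path is shallower than expected: fall back to empty names
        return {"project": "", "sequence": "", "shot": ""}
    project, sequence, shot = directories[-3:]
    return {"project": project, "sequence": sequence, "shot": shot}


info = parse_shot_path(r"X:\project\sequence\shot\image_sequence_####.png")
print(info)
```

Returning a dictionary with empty defaults keeps the calling code simple when a path turns out to be shallower than expected.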

Note that if a single directory name contains more than one entity, you can still use the above approach, but you will also need Python’s string method split() to divide the name further.

Let’s say for the sake of example that one of your directories is the sequence name AND shot name, separated by an underscore: "sequence_shot". You can then use split() to split the string by the underscore, which acts as a separator:

#Sample input
directoryName = "sequence_shot"

splitResult = directoryName.split( "_" ) #split on underscores
sequenceName = splitResult[0] #first part of the split is the sequence
shotName = splitResult[1] #second part of the split is the shot

This extends to any number of entities (splitResult will have one element for each piece, i.e. one more than the number of separators found) and to different separators (simply change the split argument to your delimiter).
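
As an illustration of those variations, here is a short sketch. The directory names in it are made up for the example and do not come from any real project. It shows a three-way split, a different delimiter, and the optional maxsplit argument of split(), which stops splitting after a given number of separators.

```python
# Hypothetical directory name holding three entities
directory_name = "seq010_shot020_v003"
sequence_name, shot_name, version = directory_name.split("_")

# A different delimiter only changes the argument passed to split()
dashed_parts = "seq010-shot020".split("-")

# maxsplit=1 stops after the first separator, which helps when the
# remaining text may itself contain the delimiter
first, rest = "seq010_shot_020_alt".split("_", 1)

print(sequence_name, shot_name, version)
print(dashed_parts)
print(first, rest)
```

Unpacking directly into names, as in the first split, raises a ValueError if the number of pieces differs from the number of names, so it is best reserved for inputs whose format is known.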

Discussion

It is important to be aware that sometimes, for whatever reason, the paths passed into Draft may not match your expectations. In that event, it is important to know how each of the above code snippets will behave.

The first example should be fairly robust; in the worst-case scenario, 'shotName', 'sequenceName', and 'projectName' will just be blank (or hold an incorrect value). However, in the second example, if 'splitResult' has fewer elements than expected, indexing it raises an IndexError. To make this more robust, you can add a try-except block around this code to prevent it from crashing your Draft template, as follows:

#Sample input
directoryName = "sequence_shot"

#Initialize our output variables, in the worst-case they will be empty
sequenceName = ""
shotName = ""

try:
    splitResult = directoryName.split( "_" ) #split on underscores
    sequenceName = splitResult[0] #first part of the split is the sequence
    shotName = splitResult[1] #second part of the split is the shot
except IndexError:
    pass
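
Note that in the try-except version, a directory name without an underscore still assigns 'sequenceName' before the IndexError is raised, so only 'shotName' stays empty. An alternative that avoids the exception altogether is the string method partition(), which always returns exactly three values. The sketch below is one possible approach rather than the recipe's own method, and the function name split_sequence_and_shot is hypothetical.

```python
def split_sequence_and_shot(directory_name, separator="_"):
    """Split a directory name into (sequence, shot) without raising.

    Hypothetical helper: the shot name is empty when the separator is
    missing, and anything after the first separator is kept in the shot name.
    """
    # partition() returns (before, separator, after); the last two are
    # empty strings when the separator is not found
    sequence_name, _, shot_name = directory_name.partition(separator)
    return sequence_name, shot_name


print(split_sequence_and_shot("sequence_shot"))
print(split_sequence_and_shot("sequence"))
```

Unlike split(), partition() only divides at the first separator, so a name such as "a_b_c" yields "a" and "b_c".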