Decoupling the filenames in a pipeline

What is the best approach to decoupling naming schemes from the art asset pipeline?

This is my 3rd project as a TA and once again I am finding myself taking too much time to correct the names of things.
It would be nice if I could organize a pipeline that simply doesn’t care what the artist names stuff,
but can still export successfully to the correct places and hook up the necessary data to implement the assets.
“Avoid naming scheme driven pipelines” is now at the top of my list of future project advice I will follow.

But I am not sure exactly what approach is best and what will serve my team the best.
I have little to no experience with things like SQL and databases, but I’m not afraid to learn if it’s a good investment.
I have MAXScript, C#, and some Python under my belt.

My basic concern is simply routing source file (.MAX, .TIF) output automatically to the correct export (.FBX, .MAT) location.
But also communicating to the engine (Unity ) the proper implementation of the assets (like connecting the right textures and shaders to the mesh).
Presumably you have to reference a file name at some point, so how do you ‘decouple’ that from the pipeline architecture?
There must be some sort of ‘switchboard’ tool for the data at some point?

Generically there are a few ways this can be done. In rough order of complexity:

  1. Tracking files - aka ‘sidecar’ files - that contain correlation information. For example, every game asset might have an asset file which includes paths for models, textures, animations, etc., so that other tools can read and use that data rather than relying on naming or location conventions.

Pros:
explicit rather than implicit
version controllable / revertable
tools don’t need to know conventions

Cons:
you have to manage a lot of new files
need to provide users tools to keep tracking files up to date - users won’t do it manually
garbage in = garbage out

Complexities:
need to make sure that the tracking files are properly controlled by source control
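As a minimal sketch of the sidecar idea, here’s what reading and writing one might look like, assuming a JSON format; every path and field name here is made up for illustration:

```python
import json

# Hypothetical sidecar data for one asset; all paths/fields are examples only.
sidecar = {
    "asset": "crate_01",
    "source": "art/props/crate_01.max",
    "export": "exports/props/crate_01.fbx",
    "textures": ["textures/crate_01_diffuse.tif"],
    "shader": "standard_opaque",
}

# Write the sidecar file next to the source asset...
with open("crate_01.asset", "w") as f:
    json.dump(sidecar, f, indent=2)

# ...and any tool can now resolve the export path without relying on
# naming or location conventions:
with open("crate_01.asset") as f:
    loaded = json.load(f)
export_path = loaded["export"]
```

The artist can name the .max file whatever they like; the exporter just reads the sidecar to find out where things go.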

  2. Database. In this version you track the same kind of information as (1) in a database, rather than in a file. This provides fast access and is great for analysis - it’s easy to ask things like ‘how many textures are referenced in this level’ by walking all of the prop references in the level. The downside is that you now have two views of the game: the one in source control and the one in the database. Over time they stay roughly in sync, but minute-to-minute they are usually diverged (somebody else might update the DB before checking in the files it references, for example).

    Pro
    Fast
    Good global overview of the game state
    Easy to update globally
    Good analytics
    Con
    Database view is rarely 100% synced with any particular local client view
    Need tools to update the DB - users won’t do it manually
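The ‘how many textures are referenced in this level’ question becomes a simple query once the references live in a database. A sketch with SQLite and a made-up two-table schema (levels-to-props, props-to-textures):

```python
import sqlite3

# Hypothetical schema: which props each level uses, and which textures each prop uses.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE props (level TEXT, prop TEXT);
CREATE TABLE prop_textures (prop TEXT, texture TEXT);
INSERT INTO props VALUES ('level_01', 'crate_01'), ('level_01', 'barrel_01');
INSERT INTO prop_textures VALUES
    ('crate_01', 'crate_01_diffuse.tif'),
    ('crate_01', 'crate_01_normal.tif'),
    ('barrel_01', 'barrel_01_diffuse.tif');
""")

# Walk all prop references in level_01 and count the distinct textures.
(count,) = con.execute("""
    SELECT COUNT(DISTINCT t.texture)
    FROM props p JOIN prop_textures t ON t.prop = p.prop
    WHERE p.level = 'level_01'
""").fetchone()
```

This kind of global question is painful to answer from loose sidecar files but trivial against an indexed DB.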

  3. “Wrapper” files. Assets are stored in ‘envelope’ files which include the kind of metadata referenced in (1) and (2). The ‘real’ data has to be extracted from the envelopes by special tools. This makes sure that the metadata and the ‘real’ files are always in sync.

    Pro
         no divergence between metadata and real data
    Con
         all file access has to go through a custom API, so it's hard to include off the shelf tools
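As a rough sketch of the envelope idea (not any particular engine’s format - the layout here is invented): one simple scheme is a length-prefixed metadata header followed by the raw asset bytes, so the two physically cannot drift apart:

```python
import json
import struct

def write_envelope(path, metadata, payload):
    """Write one file laid out as: [4-byte header length][JSON header][raw payload]."""
    header = json.dumps(metadata).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<I", len(header)))
        f.write(header)
        f.write(payload)

def read_envelope(path):
    """Return (metadata, payload) from an envelope file."""
    with open(path, "rb") as f:
        (size,) = struct.unpack("<I", f.read(4))
        metadata = json.loads(f.read(size).decode("utf-8"))
        payload = f.read()
    return metadata, payload

# The metadata travels inside the same file as the data, so they stay in
# sync by construction - the cost is that everything must use these tools.
write_envelope("crate_01.env",
               {"source": "crate_01.max", "shader": "standard"},
               b"...fbx bytes...")
meta, data = read_envelope("crate_01.env")
```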

Thanks, very useful summary!
(btw your book is on my Xmas list!)

Option 3 seems like what Unreal does with its UPK files…

I think Rob has an old GDC presentation somewhere here on the site which covers #3 as well

would that be this PDF?

Yep!

Great post, Theo (and Rob’s paper!). I think the less rig and animation-centric TAs and developers always appreciate seeing tools and methodologies that may apply to other aspects of TA development – asset management and file tools, automation and general pipeline optimizations, etc. Thanks!

The PDF sample code slides were so small and low-rez as to be barely legible.

I found a zip file of the complete presentation here

Actually everything at http://tech-artists.org/downloads/ is gold

If you search for US patent 7904877 from 2011, filed by Microsoft, you can see their idea about a sidecar file containing metadata for a game content pipeline.

We use Perforce attributes for storing/revisioning all metadata for the pipelines, and index that data into a local SQLite database for fast searching. Very happy with it, so much better than naming conventions. At the lowest level our only file naming requirement is that each base filename is unique across that project, since that’s its unique identifier. Because the full path to that file is also in the DB, we don’t even really care where that file is. As long as a tool knows the base filename, it can quickly locate it along with any desired metadata.
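Not Adam’s actual code, but a guess at the shape of that indexing step: `p4 fstat -Oa` reports attributes in tagged output (lines like `... depotFile ...` and `... attr-<name> <value>` - verify against your server), which can be parsed and pushed into a local SQLite table. All schema and attribute names below are invented for illustration:

```python
import sqlite3

def parse_fstat_attrs(fstat_text):
    """Parse 'p4 fstat -Oa' tagged output into (depot_path, attr_name, value) rows.

    Sketch only: assumes the usual tagged fstat line format; in practice the
    text would come from running p4 via subprocess.
    """
    rows, path = [], None
    for line in fstat_text.splitlines():
        if line.startswith("... depotFile "):
            path = line[len("... depotFile "):]
        elif line.startswith("... attr-"):
            name, _, value = line[len("... attr-"):].partition(" ")
            rows.append((path, name, value))
    return rows

# Index the attributes into a local SQLite DB for fast searching from tools code.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE metadata (path TEXT, name TEXT, value TEXT)")

sample = "\n".join([                       # hypothetical fstat -Oa output
    "... depotFile //depot/art/props/crate_01.max",
    "... attr-export_path exports/props/crate_01.fbx",
    "... attr-shader standard_opaque",
])
con.executemany("INSERT INTO metadata VALUES (?, ?, ?)",
                parse_fstat_attrs(sample))
```

Perforce stays the revisioned authority; the SQLite DB is just a disposable local index that can be rebuilt from `fstat` at any time.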

Why did you choose to use perforce metadata and not store that data in your SQL db?

By putting the metadata in Perforce attributes it’s automatically revisioned, just like changes to the file itself. That means we can sync back to labels, earlier dates or changelists and know that the files and metadata always match as expected. If you store metadata in your own DB then you’re on the hook for keeping that aligned with the actual file revisions. I mean Perforce itself is a database… there’s just no downside to also using it for metadata.

Before we stumbled onto P4 attributes we actually did have our own DB storing metadata. It quickly became a mountain of code and still had too many points of possible failure. So we looked around for something else.

Since a P4 fstat is needed to query attributes, and that command is relatively slow, we felt the need to keep the local SQL DBs around to make searching from tools code faster. And I’m glossing over some details on how we keep that up-to-date, but treating Perforce as the “authority” for metadata was still a big win for us.

I have read through Rob’s paper, and I am still trying to completely grasp the idea of a wrapper “sidecar” object.

“…wrappers, are the code representation of content - this is how we directly manipulate content: export, edit, compress, etc. It contains that logic. So a Max file would have a wrapper object, a texture file has a wrapper object, a Material has a wrapper object, etc.”

Does the wrapper actually contain the literal asset data?
Or does it only hold functions for manipulating the asset as a separate file?

So for example, if I were to code a wrapper for crate_mesh_01.max,
would all the geometry - verts, faces, normals, UVs, etc. - be encoded somehow within the wrapper?
Or would the wrapper simply contain a reference to the file path of crate_mesh_01.max
and some functions to manipulate it?

public class MaxWrapper {
	...
	private string maxFilePath;
	// a reference to the 3DS Max file
	// or would this file path be in the scope of the "information object"?

	...

	public void Edit() {
		// code to launch 3DS Max and open maxFilePath
	}

	public void Export(string exportedFilePath) {
		// code to launch 3DS Max, open maxFilePath, and export to FBX/OBJ or whatever
	}
}

Does it seem like I have the right idea?

I would assume it is both: it would act like a normal code object, carrying data (the metadata) and also having methods it inherits to work with said data.

I think I remember the doc mentioning that data is rebuilt (or applied) to the scene from the sidecar file with MAXScript, so yeah, I assume it’s encoded with basic data (vert positions, let’s say) and then that data is applied on ‘import’ or load. That way you could potentially edit the wrapper data outside of Max and then apply the data on load/import.

I totally may have this wrong though (haven’t read it in a while).

Keep us posted on your success, I’m very interested in this. I’m in a similar position wanting/needing to implement some fundamental structures to our pipeline, also using Max, c# (and if needs must… maxscript).

Hmm, if the wrapper object is to contain ALL the data, and not simply just point to a max file, that could take quite an effort to set up.

Yep, agreed. However, smaller chunks of more specific data that you know you want to read/edit could be more manageable… so simple mesh, texture or animation data.

I realize “know you may want” is a bit of a problem statement in itself… it’s always difficult to know what you need up front.

Again, I’m not sure if this is how it’s actually outlined, or implemented, in the talk so I may be misrepresenting it. I’d also love some clarification.

It’s not that difficult to have a wrapper file contain all the data. You can do it by just serialising another file format into your wrapper file (e.g. COLLADA, TGA, etc.). The wrapper file doesn’t need to know anything about what’s in that block of data; it just stores additional metadata about it. Then when you want to load that model (or whatever) data, you can just extract the block of data and pass it to an existing loader for that format. Preferably you’d pass the data in memory, but you could even write it out as a temp file of the original format and point the loader to that, if that’s what it expects.
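The temp-file fallback described above could look something like this (the function name and the TGA example bytes are just placeholders):

```python
import os
import tempfile

def extract_to_temp(payload, suffix):
    """Write an embedded data block out as a temp file of its original format,
    so an off-the-shelf loader that only understands file paths can open it.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    return path

# e.g. a TGA block pulled out of a wrapper file; these bytes are placeholders,
# a real block would come from reading the wrapper.
tga_bytes = b"\x00" * 18
tga_path = extract_to_temp(tga_bytes, ".tga")
# ...hand tga_path to any existing TGA loader, then delete the temp file.
```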

I guess Rob recommends an incremental approach:

Wrap whatever you can. Don’t accept parsing XML blindly. Don’t accept files or content that you treat as ‘dumb’. Everything can have, and will eventually need, logic. Don’t try to wrap everything at once, do it as you need it. But do it as soon as you need it, because you’ll need it again. Otherwise, you’ll waste time putzing around trying to get your ‘dumb’ files to do ‘smart’ things that would be easy in the OMP.

He speaks of hating binary files and wrapping things as ASCII and XML, so I assume BioWare is (or was) using a process that exports the assets as one of those two.

My impulse would be JSON over XML as I’m more familiar and it’s used heavily on the deployed side of our projects.

I wonder if FBX ascii as a string value would be effective (for meshes and skinned meshes) in this case?

[QUOTE=Adam Pletcher;26275]At the lowest level our only file naming requirement is that each base filename is unique across that project, since that’s its unique identifier.

And I’m glossing over some details on how we keep that up-to-date, but treating Perforce as the “authority” for metadata was still a big win for us.[/QUOTE]

More please :slight_smile:

How are you approaching the file lookup? If they have a unique file base name, how is that quickly looked up on the filesystem? What if an artist moves/deletes a file?