I’ve done it lots of ways, and they all have flaws. However, doing something is way, way better than doing nothing. Turn all of your knowledge about how things work into data structures and code instead of weird inferences (“up two folders, down into ‘animation’, and find the file with the same first six characters and today’s date”). Let tools work on explicit knowledge instead of naming conventions, location rules, or bugging users for input. You want “I want to build a level – what assets do I need? Please sync them and build!”, not “sync these four folders and those two files before rebuilding the level.”
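To make that concrete, here’s a minimal sketch of the idea: dependencies recorded as plain data that a tool can walk, instead of folder-layout guesswork. The asset names and the dictionary shape are invented for illustration, not from any real pipeline.

```python
# Dependencies as explicit data: a tool can answer "what does this level
# need?" by traversal, with no naming conventions or user prompts.
# All names here are hypothetical.
DEPENDENCIES = {
    "levels/docks": ["models/crane", "models/crate", "shaders/water"],
    "models/crane": ["textures/crane_diff", "textures/crane_norm"],
    "models/crate": ["textures/crate_diff"],
    "shaders/water": ["textures/noise"],
}

def assets_needed(root):
    """Walk the dependency data and return every asset `root` pulls in."""
    needed, stack = set(), [root]
    while stack:
        item = stack.pop()
        for dep in DEPENDENCIES.get(item, []):
            if dep not in needed:
                needed.add(dep)
                stack.append(dep)
    return sorted(needed)
```

A sync tool built on this can take “build the docks level” and turn it into an exact file list, which is the whole point.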
The tricky question is where the connective tissue – the metadata – is going to live.
Disk-based tracking files – like Unity .meta files – are easy to do and work well with source control; plus they can be versioned themselves, so you can see when the topology of your project changes. However, they can be subverted by knowledgeable users (not always a bad thing, if you’re doing some isolated experimentation), and you need a strong toolchain to make sure the metadata stays in lockstep with the assets you’re tracking. If you go this route, use a text-based, human-readable format (YAML, JSON, or XML, in order of preference) and write good libraries to abstract the details of reading and writing the actual disk files.
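A rough sketch of that kind of abstraction layer, using JSON since it ships with Python (swap in YAML if you have PyYAML around). The `.meta` suffix and field names are my own choices, not a real convention:

```python
import json
import os

META_SUFFIX = ".meta"  # unity-style sidecar naming; the suffix is arbitrary

def meta_path(asset_path):
    """Sidecar file sits next to the asset it describes."""
    return asset_path + META_SUFFIX

def write_meta(asset_path, data):
    """Callers hand over a dict; they never touch the disk format directly."""
    with open(meta_path(asset_path), "w") as f:
        json.dump(data, f, indent=2, sort_keys=True)

def read_meta(asset_path):
    """Missing metadata reads as an empty dict rather than an error."""
    if not os.path.exists(meta_path(asset_path)):
        return {}
    with open(meta_path(asset_path)) as f:
        return json.load(f)
```

Because every tool goes through `read_meta`/`write_meta`, you can later change the format (or move the data entirely) without touching the callers.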
Databases have a steeper learning curve (unless you have prior experience) and they introduce a potential point of failure – what happens if the server goes down? – although nowadays you can just host it all on Amazon or whatever and get 99.9% uptime. They are great for collecting stats and tracking progress: the system we used for State of Decay let me produce realtime charts of memory in all the major areas of the game and publish them in our wiki as a web page… until Confluence broke all their SQL plugins, anyway…
The main problem with DBs is making sure they reflect the state of the actual game. The best way I’ve ever come up with is to update the DB from a Perforce checkin trigger, so the DB only changes when the ‘official’ state of the game changes. However, that can be tricky, since you may not have the knowledge you need when the trigger runs – if, say, you want to track the memory size of assets in the DB, you need a way to collect that info from the model file when it’s checked in, and that tracking code will be running on the Perforce server, not a dev machine with all of the latest tools/code etc. SQL is also not the greatest tool for managing trees of data – while it’s certainly doable to have the DB tell you “this level includes this model which uses these textures and this shader”, you need some careful table design to make that snappy – a good SQL consultant is a great help, although you can teach yourself if you’ve got the usual TA gumption.
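For the tree-of-data problem, one common table design is an adjacency list plus a recursive query. A sketch using Python’s built-in sqlite3 (a real pipeline would point at a server, and the table/asset names here are invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE assets (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE deps (parent INTEGER, child INTEGER);  -- adjacency list
""")
names = ["level_docks", "model_crane", "tex_crane_diff", "shader_metal"]
con.executemany("INSERT INTO assets(name) VALUES (?)", [(n,) for n in names])
ids = {n: i + 1 for i, n in enumerate(names)}  # rowids assigned in order
con.executemany("INSERT INTO deps VALUES (?, ?)", [
    (ids["level_docks"], ids["model_crane"]),
    (ids["model_crane"], ids["tex_crane_diff"]),
    (ids["model_crane"], ids["shader_metal"]),
])

def dependencies_of(name):
    """'This level includes this model which uses these textures...'
    answered in one recursive query instead of N round trips."""
    rows = con.execute("""
        WITH RECURSIVE tree(id) AS (
            SELECT id FROM assets WHERE name = ?
            UNION
            SELECT deps.child FROM deps JOIN tree ON deps.parent = tree.id
        )
        SELECT name FROM assets JOIN tree USING (id) WHERE name != ?
    """, (name, name)).fetchall()
    return sorted(r[0] for r in rows)
```

Recursive CTEs like this are the “careful table design” part: with an index on `deps.parent`, the whole subtree comes back in one query.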
All that said, this is still my preferred solution, since the data is accessible in so many ways – from inside tools, from the web, and from dedicated SQL tools. Plus, if you’re working in Python, Django makes this stuff very high-productivity for you as a coder. Check out Adam Pletcher’s GDC 2011 talk for some cool ideas of things you can do with a good database.
I’ve never tried Adam’s use of Perforce attributes as a way of pushing the metadata into Perforce directly, but that looks like a cool way to guarantee the lockstep of metadata and data. If I were going that route today, I’d make all my key metadata into something like YAML nuggets and stick them into a single attribute on all my items, effectively making it a version of disk-based meta files.
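A sketch of that single-nugget idea: serialize the whole metadata dict to one string and attach it with `p4 attribute -n name -v value`. I’m using JSON rather than YAML so the sketch needs no extra libraries, and the attribute name `assetmeta` is my invention:

```python
import json

def attribute_command(depot_path, metadata):
    """Build (but don't run) a `p4 attribute` command that stores all
    metadata as one serialized nugget on the file. Swap json for YAML
    if you prefer; the single-attribute idea is the same either way."""
    nugget = json.dumps(metadata, sort_keys=True)
    return ["p4", "attribute", "-n", "assetmeta", "-v", nugget, depot_path]
```

A checkin trigger could run the returned command via `subprocess`, so the metadata travels with the file through every branch and sync.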
Shameless plug: the book I worked on last year has a couple of chapters on metadata and asset management that might be of use as you ponder this.