Data Store 1: The Local Workstation
Overview
In this upcoming series of posts, I’m going to catalog most of the types of data stores we use at Loose Cannon, along with features that would make you want to choose one versus another.
Sometimes the lines are blurred a bit, because data stores continue to add interesting features each year. For example, modern wikis are starting to get some decent revision control features for binary data. And some revision control systems such as Subversion have the ability to instrument the file system with metadata, like a database.
Sometimes tools can bridge the gap among these data stores as well. For example, a file system has no ability to notify via email of changes, but you can pretty easily write a file scanner that watches for changes and emails team members based on a spec. Overall, though, while tools like these are useful and can enhance the underlying data store, they can’t really change its basic nature. So I’ll be reviewing each with that in mind.
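To make the file scanner idea concrete, here is a minimal sketch in Python. It is not a tool we actually run. The function names, the snapshot format, and the shape of the "spec" (a dictionary mapping glob patterns to email addresses) are all assumptions made for illustration, and actually sending the mail is left out.

```python
import fnmatch
import os


def snapshot(root):
    """Map each file's path (relative to root) to its (mtime_ns, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                info = os.stat(full)
            except OSError:
                continue  # file vanished between listing and stat
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            state[rel] = (info.st_mtime_ns, info.st_size)
    return state


def diff(old, new):
    """Return sorted lists of added, changed, and removed paths."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    return added, changed, removed


def recipients_for(paths, spec):
    """Group paths by recipient; spec maps glob patterns to address lists."""
    outbox = {}
    for path in paths:
        for pattern, addresses in spec.items():
            if fnmatch.fnmatchcase(path, pattern):
                for address in addresses:
                    files = outbox.setdefault(address, [])
                    if path not in files:  # avoid duplicates when patterns overlap
                        files.append(path)
    return outbox
```

You would take a snapshot on a schedule, diff it against the previous one, and hand the result of recipients_for to whatever mail sender you prefer. Comparing modification time and size is cheap but can miss an edit that preserves both, so a stricter version might hash file contents instead.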
This overview of data stores is probably going to be very basic for most, and you could even question why I’m bothering with such obvious stuff. Well, that’s what Ctrl-Yawn-W is for. But this is how I think, and so I can’t help it. I always start with the basic foundation and work my way up to the top. It’s slow, but regularly revisiting old assumptions is critical. As I write these upcoming articles I’ll probably end up reconsidering some of the things I do now and will make notes for changes to the next version of the tool chain for late 2009.
Anyway, back to the post. What we’re trying to do is answer the question “where does this go?” We’ll start with the first, most obvious option.
The Local Workstation Hard Drive
This is the fastest and easiest to use data store. Everyone stores at least some of their data on their local workstation’s hard drive.
For files that need to be “shared”, you can copy or email them to the people you work with. There are services that make it easier to do this for distributed groups (Groove comes to mind), but at that point we’re outside the scope of local storage and into the realm of services. Let’s stick with local.
What Gets Stored Locally?
I see local storage used a lot with spreadsheets, concept art, in-progress design docs, audio samples, task lists, quickie test projects and scripts, and so on. Private and secure data is often stored locally as well: budgets with salary information, employee reviews, private emails, and so on. I myself keep loads of private notes, tasks, and research progress in OneNote and, occasionally, email (in recent years I’ve found email to be increasingly painful and hardly use it much for work).
Ownership and management of local data is clear and simple: you own the files and you organize them however you want! Nobody messes with them unless you explicitly share a location out and, even then, you control the permissions. You can send updates to whomever you want when you want. Operating system shells are pretty good at making this as simple and powerful as possible. You can add metadata and tags, create virtual search folders, organize using apps like Picasa and iTunes, and so on. It’s great!
Those same nice shells often store irritating hidden files like Thumbs.db, Picasa.ini, .DS_Store, and so on – files that are not meant to be shared and often clutter up shared data stores. We have a special trigger in Perforce to prevent people checking in these files by accident (there’s also .user, .bak, .~, .svn\* and a hundred other siblings in this temporary family).
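The filename check at the heart of a trigger like that can be sketched in a few lines of Python. This is not our actual trigger: the pattern list is an illustrative subset built from the names mentioned above, the function names are hypothetical, and hooking the check up to Perforce (getting the list of files in a submit and rejecting it) is not shown.

```python
import fnmatch

# Illustrative patterns only; a real list would be much longer.
BLOCKED_PATTERNS = ["thumbs.db", "picasa.ini", ".ds_store", "*.user", "*.bak"]
# Any file that lives under one of these directory names is blocked too.
BLOCKED_DIRS = {".svn"}


def is_junk(depot_path):
    """Return True if the path names a shell or tool litter file."""
    parts = depot_path.lower().split("/")
    name = parts[-1]
    if any(part in BLOCKED_DIRS for part in parts[:-1]):
        return True
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in BLOCKED_PATTERNS)


def rejected_files(depot_paths):
    """Return the subset of paths that should block a submit."""
    return [path for path in depot_paths if is_junk(path)]
```

Matching is done on the lowercased path so that "Thumbs.db" and "THUMBS.DB" are treated the same, which seems like the safer default for files that mostly come from Windows machines.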
Local Storage Can Hurt
All of this lovely freedom can make things easy for you and a freaky mess for everyone else on the team.
Say someone leaves the company. Now you have a computer filled with files and nobody knows where anything is. Their desktop is invariably going to be chaotic, with files like “sword_large_02.psd”, “sword-marketing2008.psd”, “Copy of” files and “New Folder (2)” folders and so on. And more files stored in C:\ and My Documents and other seemingly random locations, maybe even burned to piled-up CDs. I’ve known too many people who work like this. They run their hard drives down to 0 bytes free and then just delete random stuff to free up space. It’s Concentrated Crazy! Yet they still manage to kick out final PNGs to the right spot in the Perforce depot, so nobody notices the mess until it’s too late.
At Sierra we had a policy of burning to CD the full hard drive of anyone who left the team. With the turnover we had on Gabriel Knight 3, of course, that gave us a mighty big stack of CDs. Once or twice some poor intern had to go through it to (fail to) find this or that odd piece of marketing art we needed. I think all the CDs ended up in an offsite vault eventually. All that effort archiving, storing, and searching that mess was such a ridiculous waste of resources.
Beyond the hopefully rare “developer leaves team” story, local files are simply the opposite of good communication. They’re by definition private. Even if you share a folder out, it’s still your machine. If you want to send an update of a file to someone, you have to send an email or copy it to a share and IM them about it. And that’s private too, so you have to know who might need to know about these updates. This also assumes you even remember to do the update, and send the right file, and all that. I’ve lost track of the number of times I’ve gotten emails from remote workers who attached the wrong version of the file. Oops! Sorry about that, dude, let me re-send!
It’s a lot of seemingly optional responsibility to attach to a person who is probably too busy to keep it organized. At least, organized as much as the rest of the team might need.
Don’t Mess With the Crazy
I used to strongly believe that if someone leaves the team you should be able to flatten their machine and give it to someone else without worrying about losing important data. Nobody should have to decode the Crazy. So of course I was often frustrated by people on my team who insisted on working this way.
Well, I still believe this; I’ve just given in to reality. Apps are written with the assumption of either working locally, or working through some kind of custom sharing service (like Office + SharePoint, and even that isn’t done very well). It’s an uphill battle to try to enforce a structure and process on what amounts to an inherently personalized and optimized development experience. So many things are just easier locally. Like working with folders! You can rename them, or move them wherever you like, instantly. In contrast, doing the same through source control is a tedious pain.
There’s nothing necessarily wrong with using local storage for ad-hoc, brainstorm-y, temporary, or private work. The trick is knowing when to move things off the local drive to another, more team-friendly data store. One where people can subscribe to changes made, get their own copy, get a history, and so on.
So how do you know where it goes, and when to start maintaining it there instead? In writing this I’m trying to figure out some basic rules for what can stay local and what should get promoted. But it depends so much on the team, discipline, type of data, and so on. Some things are easy and absolute: all source required to build the game and its content goes into source control. Some things are more gray: where does that intermediate concept art go? Where do those little prototype gameplay modules go?
For the easy ones, look to the upcoming articles. For the more gray issues, I’m going to fall back on the “make sure it’s backed up” solution. Hope for the best, and if things go bad, go to backup for recovery. But we want to avoid the expensive problem of backing everything up and having a big disorganized mess of Crazy, right?
The solution to that I think is in a detour article I’m posting next on backup options. One great way to have your team members separate out the signal from the noise on their hard drive is through a partial backup solution. More on this next!
Buried In Data
It would be an obvious understatement to say that modern game development has a lot of data.
Our hard drives fill up with all kinds of bits: tools, source code, code reviews, source and object assets, bug reports, concept art, docs, email, game builds and debug data, CD images, gameplay analysis data, SDKs, error logs, and on and on. New types every day. Some you have initially, and some you add over the course of development. It all depends on the type and size of game.
In this next series of posts I am going to focus primarily on code and assets, and the tools and processes related to producing and managing them. That’s still a lot of data! During development, new classifications of data come up all the time, and we need to be able to answer the question of “ok great, now where does that stuff go?”
For example, let’s say that the powers-that-be decide that we’re going to add an hour of video cut-scenes to the game. This means a lot of new source assets need to be created, and then some truly huge output video files (made much worse if multiple platforms are involved).
Should everything go into revision control? Or get stored on a server file share? What about just keeping it on the video editing machine? The decision could have serious implications for team productivity, server space management, complaints from the IT department, automated builds and testing, and so on. So I want to provide some simple rules to help decide where data can and should go.
I originally was going to talk just about Perforce, but in starting to write this article I realized I needed to back up a bit and talk about how we decide what even goes in revision control in the first place. In later posts I’ll talk about the particular environment we have at Loose Cannon: the specific types of data that flow around our network and how we organize them.
Upcoming Articles
I’ve been trying to get the time to do a series of posts and I think I’m finally ready to get started. The high-level theme is “Loose Cannon’s workflow”. Like any mid-size team working on a multi-year title, we’ve built up a large set of processes and tools to support development. Some are very carefully planned and based on past experience, and that’s where I want to start: talking about mature processes and tools that I know work well. Well, at least for the types and sizes of teams I tend to work on.
Note that I’m only going to talk about the areas that I’ve either designed or been directly involved in designing. We have a large infrastructure for building and managing assets that I think works pretty well. Maya, exporting, plugins, that sort of thing. But I’m not too familiar with all that. At least not yet – looks like my task while in Peru will very much involve our content pipeline.
So anyway, within this theme, the topic for the next set of articles is how we use Perforce at LCS. Which means this series will be fly-over territory for most people Google sends here. I’ve only met a few people in my life who have actually been interested in things like depot design and the workflow around it. Well, I really love this stuff! Helping the team become more efficient and accurate is something that I enjoy more than any other kind of work. Whether it’s creating APIs or tools, or setting up bugbases, or building an ecosystem for code reviewing, I’m happy doing it all. So I can’t help but post about this.
This is the third company where I’ve set up a Perforce depot and been responsible for its design, maintenance and tool infrastructure. With input in particular from Matt Scott I think we’ve got the best source control setup I’ve ever worked with. We’ve solved a lot of nagging problems that have plagued me at previous gigs. There are still some issues, but overall things are very smooth.
I plan to cover things like the design of the depot, standards and requirements, supporting tools, and server maintenance. Perforce, out of the box, is far from a complete system. I’m pretty annoyed with those people, actually. All that money we give them every year and they’re mostly standing still, too busy rolling around naked in cash to fix basic and ancient problems with their design. But that will have to wait for a future post.

