This post addresses some of the issues associated with the management of procedural audio assets.
With procedural audio gaining traction, there is a lot of talk about the models, the tools and the engines. We have been lucky enough at Tsugi to be involved in many procedural audio projects over the past few years. With our growing collection of patches and models, a new kind of problem, still rarely addressed, has arisen: the management of these procedural audio assets.
Good Digital Asset Management (DAM) is critical to streamlining your workflow and ensuring that you can continue to use your assets through technological changes. We review below some of the challenges we encountered when these assets are procedural audio models or patches. If you are also confronted with this kind of issue (or think you will be), feel free to contact us to learn more about the solutions we developed and see how we could help you.
In the following we refer to a model as a specific procedural audio algorithm and to a patch as a set of parameters associated with that model.
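To make the distinction concrete, here is a minimal Python sketch of how a model and a patch could be represented. The class names, fields and the "wind" example are assumptions made for illustration only; they do not describe any particular tool or engine.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Model:
    """A procedural audio algorithm (hypothetical representation)."""
    name: str
    version: str
    parameter_names: tuple


@dataclass
class Patch:
    """A set of parameter values associated with one model."""
    name: str
    model: Model
    parameters: dict = field(default_factory=dict)

    def unknown_parameters(self):
        """Return the parameter names that the model does not declare."""
        return sorted(set(self.parameters) - set(self.model.parameter_names))


# One model can back many patches that differ only in their parameter values.
wind = Model("wind", "1.0", ("speed", "gustiness"))
storm = Patch("storm", wind, {"speed": 0.9, "gustiness": 0.7})
breeze = Patch("breeze", wind, {"speed": 0.2, "gustiness": 0.1})
```

Keeping a reference to the model (and its version) inside each patch is a design choice that makes it possible to detect patches whose parameters no longer match the model they were built for.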
Searching for a patch
First, how do you look for a procedural audio patch? What is the equivalent of a sound database for patches? You could for example use a generic database that will allow you to attach the patch (usually a small file) as metadata. It becomes possible to associate tags and keywords with each patch, but this is both cumbersome and error-prone. Moreover, it may be more interesting to find a patch based on its audio properties, on the synthesis modules it uses, their parameters or their arrangement.
Paradoxically enough, some features for the future of real-time audio can be borrowed from the patch librarian software of the 90s. These programs were somewhat similar, since they stored parameters for a sound synthesis algorithm rather than the sounds themselves, and provided ways to classify and find such patches.
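As a rough illustration of searching by structure rather than by tags, the Python sketch below filters a small in-memory library by the modules a patch uses and by the value of one of its parameters. The `PatchRecord` type, the `find_patches` function and the example patches are all hypothetical; a real system would sit on top of an actual database.

```python
from dataclasses import dataclass


@dataclass
class PatchRecord:
    name: str
    modules: set       # names of the synthesis modules used by the patch
    parameters: dict   # parameter name -> value


def find_patches(library, uses_module=None, param_range=None):
    """Return the names of patches matching the optional criteria.

    param_range is a (parameter_name, low, high) tuple, bounds inclusive.
    """
    results = []
    for patch in library:
        if uses_module is not None and uses_module not in patch.modules:
            continue
        if param_range is not None:
            name, low, high = param_range
            value = patch.parameters.get(name)
            if value is None or not (low <= value <= high):
                continue
        results.append(patch.name)
    return results


library = [
    PatchRecord("light_rain", {"noise", "bandpass"}, {"density": 0.2}),
    PatchRecord("heavy_rain", {"noise", "bandpass", "reverb"}, {"density": 0.9}),
    PatchRecord("engine_idle", {"oscillator", "waveshaper"}, {"rpm": 800.0}),
]

noisy = find_patches(library, uses_module="noise")
dense = find_patches(library, uses_module="noise", param_range=("density", 0.5, 1.0))
```

Because the criteria are read from the patch itself, nothing has to be tagged by hand, which avoids the error-prone step mentioned above.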
SoundDiver was one of the major universal sound librarians for synthesizers.
Browsing patches
You may also want to simply browse your collection of patches. For graphic files, the operating system will usually show you thumbnails, which makes it easy to find the right file. Unfortunately, there is no such built-in system for audio. At Tsugi, we devised a solution, QuickAudio, that interfaces with Windows at a low level and can create thumbnails (either waveforms or spectrograms) for your audio files. It can also display additional pertinent information about the audio files, such as sample rate, bit depth, number of channels, looping points, metadata and so on. Internally, we can extend QuickAudio to display information about other types of files, such as procedural audio patches. Among the extra information that can be displayed in that case is the CPU cost of a patch since, unlike that of sample playback, it is not fixed for procedural assets.
QuickAudio from Tsugi displays thumbnails for your audio files.
Another of Tsugi’s tools, DataSpace, uses machine learning to organize files in a 3D world, with similar ones being placed close to each other. Although DataSpace is usually demonstrated with audio files, it works equally well with text files (we actually had it organize all our documents, and everything was instantly and successfully categorized into invoices, release notes, license agreements and so on). DataSpace can therefore be used with procedural patches stored in a text-based format (XML for example) in order to organize them and quickly find similar ones.
DataSpace can organize audio files but also text-based patches by similarity.
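To give an idea of why a text-based format lends itself to similarity search, here is a deliberately simple Python sketch that compares XML patches by the elements and attributes they contain, using Jaccard similarity. This is not how DataSpace works (it relies on machine learning); the XML layout, the `patch_tokens` and `jaccard` functions and the example patches are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def patch_tokens(xml_text):
    """Collect element names and attribute pairs from a hypothetical XML patch."""
    root = ET.fromstring(xml_text)
    tokens = set()
    for element in root.iter():
        tokens.add(element.tag)
        for key, value in element.attrib.items():
            tokens.add(f"{key}={value}")
    return tokens


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


rain = '<patch><module type="noise"/><module type="bandpass"/></patch>'
hail = ('<patch><module type="noise"/><module type="bandpass"/>'
        '<module type="reverb"/></patch>')
engine = '<patch><module type="oscillator"/><module type="waveshaper"/></patch>'

rain_vs_hail = jaccard(patch_tokens(rain), patch_tokens(hail))
rain_vs_engine = jaccard(patch_tokens(rain), patch_tokens(engine))
```

With these toy patches, the two noise-based patches score higher against each other than against the oscillator-based one, which is the kind of grouping a similarity-based browser aims for.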
Previewing a patch
The preview of a procedural asset itself also requires careful consideration. When dealing with audio samples, we can simply double-click on a file and the system will play it. With procedural audio assets, is the structure of the patch (i.e. the model) the most important feature or is it the sound produced by a given set of parameters? Ideally we should be able to preview both…
Displaying the architecture of the patch can quickly become complex if it is large or composed of multiple sub-patches, or if some of its modules are not installed on the machine on which we try to preview it. Playing the patch is not straightforward either, especially if the model allows for random ranges for the parameters, so that the resulting sound can vary greatly between two sets of parameter values. In both cases, some user interface elements should be developed in addition to a default behavior (e.g. displaying the top patch layer, playing the default parameter values).
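The default behavior for playback can be sketched as follows in Python: a preview either uses the default value of each parameter, which is repeatable, or draws one value per parameter from its random range to audition a variation. The `ParameterRange` type, the `preview_values` function and the footstep example are hypothetical names chosen for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class ParameterRange:
    default: float
    low: float
    high: float


def preview_values(ranges, rng=None):
    """Return the parameter values to use for a preview.

    Without rng, the defaults are returned, so the preview is repeatable.
    With rng, each parameter is drawn uniformly from its [low, high] range.
    """
    if rng is None:
        return {name: r.default for name, r in ranges.items()}
    return {name: rng.uniform(r.low, r.high) for name, r in ranges.items()}


footstep = {
    "hardness": ParameterRange(default=0.5, low=0.3, high=0.8),
    "pitch": ParameterRange(default=1.0, low=0.9, high=1.1),
}

default_preview = preview_values(footstep)
# Seeding the generator makes a given variation reproducible.
variation = preview_values(footstep, random.Random(42))
```

Passing a seeded generator is one way to let a user come back to a variation they liked, instead of losing it on the next playback.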
Ensuring the longevity of the patches
An important aspect of a good digital asset management system is to ensure the longevity of the assets. How can we back up our procedural assets and make sure that our models or our patches will still be usable 5 or even 10 years later? Unlike audio files, procedural models and patches are linked to tools or engine versions. We need to make sure that the procedural audio system we are using will still be available many years later (even better, supported!) or can at least save its data in an easily readable, non-proprietary file format. This clearly demonstrates the need for a common file format. In the same way that we can drop a sample file in any audio software and it will be instantly recognized, in the future we should be able to share a procedural model between applications.
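As an example of what an easily readable, non-proprietary representation could look like, the Python sketch below saves a patch as JSON together with a format version and the version of the model it depends on. The schema, the `FORMAT_VERSION` constant and the `dump_patch` / `load_patch` functions are assumptions for illustration; no such common format exists in this form.

```python
import json

FORMAT_VERSION = 1  # hypothetical schema version, not an existing standard


def dump_patch(name, model, model_version, parameters):
    """Serialize a patch to human-readable JSON text."""
    document = {
        "format_version": FORMAT_VERSION,
        "name": name,
        "model": {"name": model, "version": model_version},
        "parameters": parameters,
    }
    return json.dumps(document, indent=2, sort_keys=True)


def load_patch(text):
    """Parse a patch, refusing format versions this code does not know."""
    document = json.loads(text)
    version = document.get("format_version")
    if version != FORMAT_VERSION:
        raise ValueError(f"unsupported patch format version: {version!r}")
    return document


saved = dump_patch("storm", "wind", "1.0", {"speed": 0.9, "gustiness": 0.7})
restored = load_patch(saved)
```

Recording both versions explicitly means that, years later, a reader of the file can at least tell which tool and which schema produced it, even if the original software is no longer available.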
This concludes this short introduction to the management of procedural audio assets. If you need any help with procedural audio in general, feel free to contact us!