As the number of upstream sources and downstream channels for digital assets reaches an all-time high, the notion of the DAM as the single source of truth is becoming increasingly complex to manage for users and vendors of Digital Asset Management solutions alike.
For DAM to maintain control over an ever-broadening scope of digital assets, its role as a centralized source from which all content flows needs to expand accordingly. It is almost universally understood now that DAMs are not simply digital warehouses. Recent developments have seen an increased willingness by DAM vendors to accommodate interoperability between once-isolated systems and integrate a new wave of emerging technologies into their platforms.
One of the most significant potential advances in DAM is the introduction of Artificial Intelligence. The level of innovation and the speed with which the technology is progressing have been genuinely impressive. By contrast, the implementation of AI into the core workflows of DAMs remains quite basic and superficial. Flocks of vendors eager to ride the AI gravy train have rushed to incorporate auto-tagging, yet the results have been underwhelming, and very few DAM systems have advanced beyond the most basic AI offerings.
However, as the technology advances, a number of improved AI-assisted DAM features are now beginning to appear.
AI-powered tagging systems use contextual understanding to improve the accuracy of tags. For instance, understanding that a picture of a "beach" often includes "sand", "water", and "sun" can help the AI generate more comprehensive and relevant tags.
Deep learning models, trained on vast datasets of labelled images and videos, can recognize a wide range of objects, scenes, and activities. When an asset is processed, the model identifies and labels the components within it.
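The two ideas above can be combined into a simple tagging pipeline: a vision model proposes labels with confidence scores, and contextual knowledge expands them. The sketch below is a minimal illustration, assuming the (label, confidence) pairs come from an upstream model; the co-occurrence map is a hypothetical stand-in for statistics mined from a labelled corpus.

```python
# Hypothetical co-occurrence map: which tags commonly appear alongside a label.
CO_OCCURRENCE = {
    "beach": ["sand", "water", "sun"],
    "office": ["desk", "computer", "meeting"],
}

def expand_tags(model_labels, min_confidence=0.6):
    """Keep confident model labels, then append contextually related tags."""
    tags = [label for label, conf in model_labels if conf >= min_confidence]
    for label in list(tags):
        for related in CO_OCCURRENCE.get(label, []):
            if related not in tags:
                tags.append(related)
    return tags

# A low-confidence label like "dog" is dropped; "beach" pulls in related tags.
print(expand_tags([("beach", 0.92), ("dog", 0.40)]))
```

In a production system the co-occurrence data would itself be learned from the asset corpus rather than hand-written, but the principle — context making tags more comprehensive — is the same.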
The issue of subject-specific metadata (which is the mainstay of most commercial implementations) continues to be handled in a non-optimal manner, but some gradual progress is being made. Further, with the right contextual training, access to metadata across the digital asset supply chain and effective metadata management strategies being put in-place by DAM users, this long-standing issue may well also begin to finally get resolved.
Using generative AI, visual search allows users to discover assets based on visual elements rather than depending on text-based metadata. A growing number of DAM vendors are now offering visual search capabilities, either developed in-house or using third-party technologies.
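Under the hood, visual search typically works by comparing embedding vectors: each stored asset is encoded once, the query image is encoded the same way, and the nearest vectors are returned. The sketch below uses tiny hypothetical 3-dimensional vectors in place of real model output (a production system would use a vision encoder and a vector index).

```python
import math

# Hypothetical pre-computed embeddings, keyed by asset ID.
ASSET_EMBEDDINGS = {
    "IMG_001": [0.9, 0.1, 0.0],
    "IMG_002": [0.1, 0.9, 0.2],
    "IMG_003": [0.8, 0.2, 0.1],
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def visual_search(query_vector, top_k=2):
    """Rank stored assets by similarity to the query image's embedding."""
    ranked = sorted(ASSET_EMBEDDINGS.items(),
                    key=lambda item: cosine_similarity(query_vector, item[1]),
                    reverse=True)
    return [asset_id for asset_id, _ in ranked[:top_k]]

print(visual_search([1.0, 0.0, 0.0]))
```

At real scale the linear scan would be replaced by an approximate nearest-neighbour index, but the ranking logic is identical.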
Traditionally requiring a judicious human eye, this feature uses AI to automatically crop images whilst focusing on key elements or subjects to preserve visual relevance. This allows automated image editing for various channels, platforms and devices.
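A toy version of the idea can be sketched with a sliding window that keeps the crop containing the most detail. Here local horizontal contrast stands in for a learned saliency model, and the 6×6 "image" is a hypothetical grayscale grid.

```python
def contrast(image, top, left, size):
    """Sum of horizontal pixel differences inside a size x size window."""
    total = 0
    for r in range(top, top + size):
        for c in range(left, left + size - 1):
            total += abs(image[r][c] - image[r][c + 1])
    return total

def smart_crop(image, size):
    """Return (top, left) of the size x size window with the most detail."""
    rows, cols = len(image), len(image[0])
    best = max(((contrast(image, t, l, size), (t, l))
                for t in range(rows - size + 1)
                for l in range(cols - size + 1)),
               key=lambda scored: scored[0])
    return best[1]

flat = [[10] * 6 for _ in range(6)]
flat[4][4] = 200  # a single high-contrast "subject" towards the bottom-right
print(smart_crop(flat, 3))
```

A real implementation would score windows with a saliency or subject-detection model and respect target aspect ratios per channel, but the search-for-the-best-window structure carries over.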
AI algorithms, particularly those based on deep learning, extract features from the digital asset. In the case of images and videos, this might involve identifying shapes, colors, objects, and scenes. For text documents, natural language processing (NLP) techniques are used to extract key phrases, entities, and concepts.
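For the text side, even a bare-bones frequency count conveys the idea of extracting key phrases: drop the stop words, count what remains, surface the most frequent terms. The stop-word list and sample document below are illustrative; a production system would add proper tokenisation, stemming and named-entity recognition.

```python
from collections import Counter

STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "for"}

def extract_keywords(text, top_k=3):
    """Return the top_k most frequent non-stop-words as candidate keywords."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    counts = Counter(w for w in words if w and w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_k)]

doc = ("The campaign brief covers the product launch. "
       "The launch targets the retail campaign.")
print(extract_keywords(doc))
```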
AI and speech recognition are now able to generate accurate transcriptions of dialogue and captions within video and audio files. In turn, this descriptive text can be used to provide metadata and thus improve context and discoverability.
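Once a transcription service has produced time-coded segments, converting them into captions is straightforward. The sketch below emits the widely used SubRip (SRT) format; the segment structure shown is an assumption about what a typical speech-to-text service returns.

```python
def to_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    millis = int(round((seconds - int(seconds)) * 1000))
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"

def to_srt(segments):
    """Render a list of {start, end, text} segments as SRT caption blocks."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(f"{i}\n{to_timestamp(seg['start'])} --> "
                      f"{to_timestamp(seg['end'])}\n{seg['text']}")
    return "\n\n".join(blocks)

segments = [
    {"start": 0.0, "end": 2.5, "text": "Welcome to the product demo."},
    {"start": 2.5, "end": 5.0, "text": "Today we cover the new features."},
]
print(to_srt(segments))
```

The same segment text can be indexed as searchable metadata, which is where the discoverability gain comes from.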
With the advent of multimodal Large Language Models (LLM) that can be trained using multiple data types, DAM systems are now poised to take advantage of a more bespoke form of AI that can be trained using a company's existing digital assets, documents, products, analytics data, marketing resources and even brand guidelines. Not only are the digital assets potential training sources, but also the metadata generated from the upstream digital asset supply chain such as briefing documents, emails and even audio transcripts of meetings about marketing campaigns.
Such a broad yet tailored AI model could not only recognize, catalog, tag and apply highly descriptive metadata to digital assets, but also retrieve assets based on meaning, and generate completely new content based on natural language prompts such as "show me corporate images that contain the company building", or "create a brand-compliant and family-friendly marketing banner promoting our latest product".
Filling out metadata forms can be a long and laborious process, which often leads to errors and empty fields, ultimately resulting in undiscoverable and under-utilized digital assets. Given the available technologies, it may be tempting to hand over the reins to AI to automatically generate metadata.
However, as accurate, concise and relevant metadata is crucial to a DAM operating efficiently, it is wise to incorporate a level of human oversight and governance. If anything, the advent of AI and its tendency to lack context has made human-verified metadata even more important.
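One common way to add that oversight without sacrificing automation is a confidence gate: tags the model is sure about are applied automatically, while the rest are queued for a human reviewer. The threshold and queue shape below are illustrative assumptions, not any vendor's API.

```python
def triage_tags(ai_tags, auto_threshold=0.85):
    """Split AI-suggested (tag, confidence) pairs into auto-approved tags
    and a queue of uncertain suggestions awaiting human review."""
    approved, review_queue = [], []
    for tag, confidence in ai_tags:
        if confidence >= auto_threshold:
            approved.append(tag)
        else:
            review_queue.append((tag, confidence))
    return approved, review_queue

approved, queue = triage_tags([("logo", 0.97), ("beach", 0.91), ("dog", 0.55)])
print(approved)  # applied automatically
print(queue)     # routed to a reviewer
```

Tuning the threshold lets an organisation trade reviewer workload against metadata accuracy.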
Although the popularity of Generative AI (GenAI) platforms such as DALL-E, Midjourney and Stable Diffusion might suggest widespread adoption, surprisingly few DAM vendors offer their users the ability to create visual assets from scratch. This hesitation may be due to a number of immediate obstacles: ethical and legal implications surrounding ownership and copyright; technical complexity and the risk of disruption; and the added cost of providing resource-intensive services. That said, third-party tool vendors are now emerging whose GenAI models have been trained on sources that neither infringe copyright nor use unauthorized biometric data.
A further interesting application of GenAI technology could be the adaptation of existing images to different contexts, such as local features, while still retaining the visual style and composition of the original visual asset, thus keeping the generated output on brand.
Aside from the challenges outlined above, the prospect of offering DAM users the ability to generate or modify brand images, marketing material or product shots on-the-fly by merely entering text descriptions is clearly an appealing one.
As these new digital shoots emerge, there's been a corresponding groundswell in the number of companies offering to fertilize them with custom training sets and large language models that are business-specific, on-brand and free from any copyright restrictions.
Once these models are trained, the time and cost of instantly generating a product image using GenAI will be significantly less than the resources required to organize a photo-shoot, hire photography and lighting equipment, arrange locations, logistics and transport, and liaise with numerous personnel.
The technology to create new photo-realistic images and video already exists, as does the ability to replace or generate backgrounds, faces, objects, lighting, shadows, reflections and caustics. Many DAMs can already resize and reformat images for different output destinations, and with this, the toolkit is almost complete. The final component is perhaps the most important – injecting some warmth into the cold heart of our AI technology stack. For this, we need some humans.
In order for GenAI to be able to accurately portray a brand's visual personality and not just churn out bland imitations, users will be required to master the art of prompt engineering – the construction of targeted, relevant and often very precise descriptive text instructions. Effective prompts will need to guide the model by providing a more detailed set of brush-strokes in the form of hints on sentiment, photography, film and lighting techniques, color palettes, composition, and stylistic requirements.
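Those brush-strokes can be captured in a small helper that assembles a structured prompt from named ingredients, keeping on-brand elements consistent across prompts. The field names and defaults below are illustrative, not a standard.

```python
def build_prompt(subject, lighting=None, lens=None, style=None, palette=None):
    """Compose a generative-image prompt from optional styling ingredients."""
    parts = [subject]
    for value in (lighting, lens, style, palette):
        if value:
            parts.append(value)
    return ", ".join(parts)

print(build_prompt(
    subject="ornate green glass perfume bottle on purple silk",
    lighting="dramatic lighting",
    lens="35mm, narrow depth of field",
    style="product photography",
))
```

In practice a brand's fixed styling ingredients (palette, photographic style) would be stored centrally so every user's prompt inherits them.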
This human touch, combined with a custom LLM and multimodal model trained on the organization's own data is likely to become the next big trend in bespoke digital content creation.
Example prompt: "An ornate green glass perfume bottle in front of folds of purple silk, dramatic lighting, 35mm, bokeh, narrow depth of field." Generated with the Stable Diffusion model Absolute Reality, version 1.8.1.
By combining the power of AI and translation APIs, DAM systems are now in a position to both generate and search relevant, highly accurate and multilingual metadata, keywords and descriptions.
Some of the best translation APIs suitable for DAM integration include:
Google's cloud translation service supports over 100 languages and offers real-time language detection, translation and batch capabilities. As with most third-party service integrations, interaction is achieved via their API.
Launched in 2017, Germany-based DeepL is hailed as one of the most accurate translation platforms available and features on Forbes' list of the 100 best cloud companies. With AI-assisted translations in over 30 languages, a robust API and numerous integration options, it is a flexible, secure and scalable enterprise-level solution.
Microsoft's translation platform seamlessly integrates with Microsoft Azure, which a number of DAM vendors are already using for storage purposes. The translation platform supports artificial intelligence and machine learning features and also offers a flexible API usage model.
Amazon's translation service, much like Microsoft's, offers tight integration with its parent platform, AWS (Amazon Web Services) – another platform commonly used by DAM vendors to provide cloud storage and third-party services. Amazon Translate supports over 55 languages and offers real-time translation, batch processing, and custom terminologies for industry-specific translations.
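Whichever provider is chosen, a DAM integration typically batches metadata values into a single API request. The sketch below builds a request payload shaped like Google Cloud Translation v2 (repeated `q` values plus source and target languages); authentication and the actual HTTP call are omitted, and the metadata values are illustrative.

```python
TRANSLATE_ENDPOINT = "https://translation.googleapis.com/language/translate/v2"

def build_translation_request(metadata_values, target_lang, source_lang="en"):
    """Batch DAM metadata strings into one translation API payload."""
    return {
        "url": TRANSLATE_ENDPOINT,
        "body": {
            "q": metadata_values,  # one entry per title/keyword/description
            "source": source_lang,
            "target": target_lang,
            "format": "text",
        },
    }

request = build_translation_request(["Summer campaign banner", "beach"], "de")
print(request["body"]["target"])
```

Batching matters in a DAM context: translating titles, keywords and descriptions per asset one string at a time would multiply request costs and latency.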
With the intersection of affordable 3D-printing, Generative AI, and smartphone capabilities, 3D assets now represent a growing percentage of digital content. Yet despite the explosion in mobile game development, virtual reality (VR) and augmented reality (AR), only a very small number of vendors cater for the growing demand to effectively manage 3D digital assets.
Like other relatively new technologies, such as GenAI, providing support for 3D files within DAM poses its own set of challenges. Vendors need to decide which formats to support, and as a bare minimum, users will expect to be able to upload, manage and preview 3D objects in real-time. CPU and GPU resources will also need to be considered, along with robust and secure storage of large files.
Today, beyond a single source of truth for their assets, most brands are searching for a DAM system that can manage the localization of their assets on two levels. The first is localizing the global design, which differs from one country to the next. Users may need to adapt branding elements, replace assets or customize text for specific product campaigns. Document formats may also need to differ depending on the target output channel, from traditional print to digital banners for social media and e-commerce platforms.
Secondly, translation management represents a key milestone in marketing products in different countries, with each territory having its own set of unique regulations and challenges, including language, layout and text size. Some DAM systems can handle these issues via dynamic templates that contain editable text and image blocks. To manage automation effectively, stakeholders should be able to designate a master asset (e.g. an InDesign or Photoshop file) that can be used as an adaptive template for localization across different countries, languages and output channels. New AI technology could also enable direct translation of banners without having to deal with heterogeneous source file formats.
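The master-template idea can be modelled simply: the designated master asset is a set of named text and image blocks, and each locale overrides only the blocks it needs. The block names and locale data below are hypothetical.

```python
# A designated master asset, modelled as named editable blocks.
MASTER_TEMPLATE = {
    "headline": "Big Summer Sale",
    "cta": "Shop now",
    "hero_image": "hero_default.png",
}

# Per-locale overrides; untouched blocks fall back to the master.
LOCALE_OVERRIDES = {
    "de-DE": {"headline": "Grosser Sommer-Sale", "cta": "Jetzt einkaufen"},
    "fr-FR": {"headline": "Grandes soldes d'ete", "cta": "Achetez maintenant",
              "hero_image": "hero_fr.png"},
}

def render_for_locale(locale):
    """Merge locale-specific blocks over the master template."""
    rendered = dict(MASTER_TEMPLATE)
    rendered.update(LOCALE_OVERRIDES.get(locale, {}))
    return rendered

print(render_for_locale("de-DE")["headline"])
```

Real systems layer on per-locale layout rules (text expansion, reading direction) on top of this merge, but the master-plus-overrides structure is the core of template-driven localization.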
The various AI technologies that are now emerging are gradually becoming more mature and usable in a production environment. The investment that forward-thinking DAM technology providers have made into integration with third party technologies and Digital Asset Supply Chains will enable them to leverage many of their benefits via AI.
Increasingly, DAM systems will be the coordinators or mediators of AI-driven activity. The classic DAM use cases, cataloging and search, will remain, but their significance will diminish. The role of DAM products will become more strategic: an orchestrator rather than an asset warehouse with a search facility.
This has important implications for both vendors and end-users of DAM technology who must shift their mindset away from a file-centric perspective and more towards a holistic Digital Asset Supply Chain approach.