Today we are introducing vSound — our first generative music model. Describe a mood, a genre, an arrangement, or hum eight bars; vSound writes a complete track that holds together from the first hook to the final bar. It is available now inside vMira and through the API for studios.
vSound was built with input from producers and musicians. The brief was direct: an end-to-end music model that respects musicality — rhythm, arrangement, dynamics — and that ships with the kind of provenance an industry with established rights conventions can build on.
The model writes the melody, the bass, the drums, and the texture together rather than stitching pre-rendered stems. The result is tracks that breathe in the same key — and the freedom to re-prompt a single bar without re-rolling the whole song.
What it does
Type a mood. A single sentence, such as "a summer evening in Sochi, optimistic but tired", produces a usable opener in under thirty seconds.
Lock the parameters. Force a key, tempo, time signature, or instrument family; vSound respects every constraint you set.
Condition on melody. Hum or upload eight bars and the model arranges the rest of the song around your hook.
Stems on demand. Drums, bass, harmony, and lead can be exported separately at render time so you mix in the DAW you already work in.
Lyric-aware vocals. Drop a verse in Russian or English and the line sits in tune and on the meter.
Re-roll a bar. Highlight measure seventeen, ask for a variation, and the rest of the song stays in place.
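For studios planning an integration, the constraints above could be carried in a request object along the following lines. This is only a sketch: the API schema has not been published here, so the class name `TrackRequest`, every field name, and the JSON layout are assumptions made for illustration. Only the four stem names and the list of lockable parameters are taken from the text above.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional

# The four stem names come from the announcement; everything else is hypothetical.
STEMS = {"drums", "bass", "harmony", "lead"}


@dataclass
class TrackRequest:
    """Illustrative request shape, not the published vSound API."""
    prompt: str
    key: Optional[str] = None                # e.g. "A minor"
    tempo_bpm: Optional[int] = None
    time_signature: Optional[str] = None     # e.g. "4/4"
    instrument_family: Optional[str] = None
    lyrics: Optional[str] = None
    stems: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject stem names the announcement does not list.
        unknown = sorted(set(self.stems) - STEMS)
        if unknown:
            raise ValueError(f"unknown stems: {unknown}")

    def to_payload(self) -> str:
        # Serialize only the constraints the caller actually locked.
        body = {k: v for k, v in asdict(self).items() if v is not None and v != []}
        return json.dumps(body, ensure_ascii=False, sort_keys=True)


request = TrackRequest(
    prompt="a summer evening in Sochi, optimistic but tired",
    key="A minor",
    tempo_bpm=96,
    stems=["drums", "bass"],
)
payload = request.to_payload()
```

Leaving unset constraints out of the payload keeps the distinction between "locked by the user" and "left to the model" explicit, which matters when every locked constraint is expected to be honored.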
Built with musicians
vSound was developed alongside producers and session musicians who shaped its training corpus, its evaluation criteria, and its refusal calibration. Russian folk, jazz, classical, hip-hop, club, and orchestral film music entered the data with equal weight, so the model's defaults are not skewed toward a single market. Where artist consent matters (voice timbre, stylistic mimicry), we built the consent flow first and the capability second.
“A music model is only as good as the relationships behind it. We did not want a system that produced facsimiles. We wanted one that producers were willing to put their name next to.”
Provenance and safety
Every track vSound generates is fingerprinted with a SynthID-class watermark designed to survive MP3 compression, time-stretching, and capture from a speaker. C2PA provenance metadata travels with the file — model version, render timestamp, prompt hash. Lyric-aware vocals will not imitate the voice of any real artist without that artist's on-file consent. We publish the dataset families used in training and the licence tier of each, and pay royalties through the standard collective-rights mechanisms in every market we operate in.
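To make the three provenance fields concrete, the sketch below assembles a record with a model version, a render timestamp, and a prompt hash. It is not a C2PA manifest (real manifests are signed structures with their own schema) and it is not vSound's implementation. The function names, the field names, and the choice of SHA-256 are all assumptions; the announcement does not name the hash function.

```python
import hashlib
from datetime import datetime, timezone


def prompt_hash(prompt: str) -> str:
    # SHA-256 is an assumption; the source does not say which hash is used.
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def provenance_record(model_version: str, prompt: str, rendered_at: datetime) -> dict:
    """Build an illustrative record; `rendered_at` should be timezone-aware."""
    return {
        "model_version": model_version,
        "render_timestamp": rendered_at.astimezone(timezone.utc).isoformat(),
        "prompt_hash": prompt_hash(prompt),
    }


record = provenance_record(
    model_version="vsound-example",  # placeholder, not a real version string
    prompt="a summer evening in Sochi, optimistic but tired",
    rendered_at=datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc),
)
```

Storing a hash rather than the prompt itself means a party who already holds the prompt can check that it matches the file, while the metadata does not disclose the prompt text.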
Available today
vSound is live for everyone inside vMira on Plus, Pro, and Teams. The API is open to verified studios and labels through our developer portal — write to business@vcorp.co for a pilot. Hosted requests run inside the Russian Federation by default; cross-border processing is opt-in and documented per Federal Law 152-FZ. Pricing for the API tier and details on the on-premise build will be published in a separate note next month.
What is in the launch tier
Tracks up to three minutes, twelve genre families, vocals in Russian and English, separated stems on demand, image-to-music conditioning, and DAW export to the standard project formats. Longer-form composition (full arrangements, multi-section pieces beyond three minutes) and additional minority languages of the Federation are scheduled for the next release.
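A client could check a planned render against these launch-tier limits before sending it. The sketch below covers only the two limits stated numerically or by name above (the three-minute cap and the two vocal languages). The function name and the use of `ru` and `en` as language codes are assumptions, not part of any published interface.

```python
from typing import List, Optional

MAX_DURATION_SECONDS = 3 * 60      # "tracks up to three minutes"
VOCAL_LANGUAGES = {"ru", "en"}     # Russian and English; the codes are an assumption


def launch_tier_problems(duration_seconds: float,
                         vocal_language: Optional[str] = None) -> List[str]:
    """Return human-readable reasons a request falls outside the launch tier."""
    problems = []
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append(f"duration {duration_seconds}s exceeds {MAX_DURATION_SECONDS}s")
    if vocal_language is not None and vocal_language not in VOCAL_LANGUAGES:
        problems.append(f"vocal language {vocal_language!r} is not in the launch tier")
    return problems
```

An empty list means the request fits; an instrumental track simply omits `vocal_language`.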
What it does not do
vSound is not a real-time instrument and is not a substitute for live performance. It does not imitate named artists without their on-file consent. It does not generate audio that closely tracks a copyrighted recording supplied as conditioning input; the model will refuse and explain the constraint. As with our other models, we publish the limits we know about so customers know what to plan around.