The goals of this project
1) to reconstruct the original sound as accurately as possible
2) to transform the sound in as many different ways as possible
The tools used in this project
Tools invented
- steady-sound.lisp (by Tak)
- sample Lisp code that picks one frame of the spectrum and makes a sound
in which every frame is identical to that particular frame of the original.
- presentation
- loops.lisp (by Harvey)
- Lisp code that creates a looped version of the original sound.
- express.lisp (by Tak and Harvey)
- ATS functions for superimposing vocal-like expressiveness on ATS-format
sounds.
- 1) "vibrato-sound" is a multifaceted tool designed to give
vocal-like expression to an ATS sound. Vibrato and amplitude envelopes
are applied over the whole duration, so the vibrato is a time-varying effect.
- 2) "transfre-specenv" allows you to make frame-wise modifications,
such as morphing.
morphing example: takuya-a-i-transient.snd
- Observation of actual diphones showed that formant behavior at vowel
transients is very complicated. So I gave up the idea of imitating it
and simply tried a formant cross-fade in this example.
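The two frame-wise ideas above (holding one frame for a steady sound, and cross-fading formants for a morph) can be sketched as follows. This is only an illustration in Python, not the project's Lisp code: the frame representation (a plain list of amplitudes) and the function names freeze_frame and crossfade_envelopes are hypothetical, and the cross-fade is assumed to be linear.

```python
def freeze_frame(frames, index):
    """Return a sequence as long as `frames` in which every frame is a
    copy of frames[index], i.e. a perfectly steady version of the sound."""
    chosen = frames[index]
    return [list(chosen) for _ in frames]


def crossfade_envelopes(env_a, env_b, n_frames):
    """Linearly cross-fade from spectral envelope env_a to env_b.

    env_a and env_b are equal-length lists of amplitudes (hypothetical
    representation). Returns n_frames envelopes; the first equals env_a
    and the last equals env_b.
    """
    if len(env_a) != len(env_b):
        raise ValueError("envelopes must have the same length")
    if n_frames < 2:
        raise ValueError("need at least two frames to cross-fade")
    result = []
    for k in range(n_frames):
        t = k / (n_frames - 1)  # 0.0 at the first frame, 1.0 at the last
        result.append([(1.0 - t) * a + t * b for a, b in zip(env_a, env_b)])
    return result
```

For example, crossfade_envelopes(env_of_a, env_of_i, 50) would give 50 intermediate envelopes between two vowels, where env_of_a and env_of_i stand for envelopes taken from the analyzed data.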
Accessories invented
- wave2lisp (by Tak)
- a dirty shell script that automates file conversion: it issues the
commands needed to convert a sound file to an analyzed data file. Note
that it works only for the sound files prepared for this project, which
follow a specific naming convention.
- idlenext (by Tak)
- another dirty shell script to find an idle black NeXT machine around here.
It will hear the call: "Wake up! Do not sleep! Work hard!"
- playit.lisp (by Tak and Harvey)
- a tiny Lisp function for listening to an analyzed data object. It also
applies the optimizing function if the object has not been optimized yet.
- fins.clm (by Harvey)
- a non-real-time vocal instrument with envelope-controlled vibrato rate
and depth, built on 3 formants, each specified by its center frequency
and Q.
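As an illustration of what "envelope-controlled vibrato rate and depth" means, here is a minimal sketch in Python. It is not the fins.clm instrument: the names envelope_value and vibrato_tone, the breakpoint format, and the default sample rate are all assumptions, and the three formant filters are left out entirely.

```python
import math


def envelope_value(breakpoints, t):
    """Piecewise-linear envelope; breakpoints is a list of (time, value)
    pairs sorted by time (hypothetical format)."""
    if t <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (t0, v0), (t1, v1) in zip(breakpoints, breakpoints[1:]):
        if t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return breakpoints[-1][1]


def vibrato_tone(base_hz, duration, rate_env, depth_env, sample_rate=8000):
    """Sine tone whose vibrato rate (Hz) and depth (fraction of base_hz)
    both follow envelopes over time."""
    samples = []
    phase = 0.0      # phase of the audible tone
    vib_phase = 0.0  # phase of the vibrato oscillator
    for i in range(int(duration * sample_rate)):
        t = i / sample_rate
        rate = envelope_value(rate_env, t)
        depth = envelope_value(depth_env, t)
        freq = base_hz * (1.0 + depth * math.sin(vib_phase))
        samples.append(math.sin(phase))
        # Accumulate both phases so that changes in rate or depth
        # never cause a jump in the waveform.
        vib_phase += 2.0 * math.pi * rate / sample_rate
        phase += 2.0 * math.pi * freq / sample_rate
    return samples
```

A call such as vibrato_tone(440.0, 2.0, [(0, 4), (2, 6)], [(0, 0.0), (2, 0.02)]) would start without vibrato and end with a 6 Hz vibrato of about 2 percent depth.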
Procedure
Diagram
(by Ching)
1) Several English vowels are recorded.
** All of the sound sources for this project are located in the
'/usr/ccrma/snd/fujishim/mus220b/' directory.
2) Some of the recorded vowels are looped in the waveform domain using
SoundDesigner (by Unjung and Ching). They are the following:
| Voice  | Vowel | Pitch        | File            |
| Female | /a/   | 'a' (440 Hz) | l-uaa-mix3.aiff |
| Female | /a/   | 'd' (293 Hz) | l-uda-mix3.aiff |
| Female | /i/   | 'a' (440 Hz) | l-cai-mix3.aiff |
| Female | /i/   | 'd' (293 Hz) | l-cdi-mix3.aiff |
| Male   | /a/   | 'c' (131 Hz) | l-tca-mix3.aiff |
| Male   | /a/   | 'g' (175 Hz) | l-tga-mix3.aiff |
| Male   | /i/   | 'c' (131 Hz) | l-hci-mix3.aiff |
| Male   | /i/   | 'g' (175 Hz) | l-hgi-mix3.aiff |
3) smsAnal is used in 'wave2lisp' to analyze the original sound.
Here are some explanations of the parameters: smsflags.
4) smsToATS is used to convert .sms files to .lisp files so that the sounds
can be resynthesized or transformed in the ATS environment.
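The file names listed in step 2 appear to follow a pattern (prefix, three-letter code, mix tag), and the steps above move each sound from .aiff through .sms to .lisp. The Python sketch below shows how such names could be taken apart and how the intermediate file names could be derived. It is not the wave2lisp script: the meaning of each field is inferred from the listed names (the first letter of the code is only assumed to identify the singer), and keeping the same base name across extensions is also an assumption.

```python
import os


def parse_source_name(filename):
    """Split a name such as 'l-uaa-mix3.aiff' into its apparent fields."""
    stem, ext = os.path.splitext(filename)
    prefix, code, mix = stem.split("-")
    if len(code) != 3:
        raise ValueError("expected a three-letter code, got %r" % code)
    return {
        "prefix": prefix,    # 'l', apparently marking a looped sound
        "singer": code[0],   # assumption: first letter identifies the singer
        "pitch": code[1],    # pitch name, matching the listed pitches
        "vowel": code[2],    # vowel, matching the listed vowels
        "mix": mix,
        "extension": ext,
    }


def derived_files(filename):
    """Names of the analysis (.sms) and ATS (.lisp) files, assuming the
    base name is kept unchanged at every step."""
    stem, _ = os.path.splitext(filename)
    return stem + ".sms", stem + ".lisp"
```

For instance, parse_source_name("l-tga-mix3.aiff") reports pitch "g" and vowel "a", which agrees with the entry for that file.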
Steps
- Good point: the sound can be manipulated in the spectral domain, which
brings flexibility, such as making a very steady sound from an original
that fluctuates.
- Bad point: smsAnal is hard to control; it is really hard to find a good
set of parameters (flags). We have not succeeded in capturing bright vowels
as they are. Besides, smsAnal takes a long time to run: a 10-second
wave file takes about 30 minutes to analyze.
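To put the analysis time in perspective, 30 minutes for 10 seconds of sound is about 180 times slower than real time. The short Python sketch below does this arithmetic; the function names are hypothetical, and estimating other durations assumes that analysis time grows linearly with sound length, which was not measured.

```python
def analysis_slowdown(sound_seconds, analysis_minutes):
    """How many times slower than real time the analysis runs."""
    return analysis_minutes * 60.0 / sound_seconds


def estimated_analysis_minutes(sound_seconds, slowdown):
    """Estimated analysis time, assuming linear scaling with duration."""
    return sound_seconds * slowdown / 60.0


# Figures from the text: a 10-second file takes about 30 minutes.
slowdown = analysis_slowdown(10.0, 30.0)
```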
Perspectives:
1) morphing, for example from vowel "a" to "i",
or from a male voice to a female voice.
2) more kinds of sound transformations