the seemingly endless science fiction image generator is a project i started recently (May 0x7DA) which will probably be used to produce some full dome content for the next domefest. the idea is principally based on a paper by the NEAT guys:
Jason Gauci and Kenneth O. Stanley: Generating Large-Scale Neural Networks Through Discovering Geometric Regularities. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2007). New York, NY: ACM.
in this paper the method is used to apply evolutionary search to very big neural nets, but it can just as well be used to create pure and stunning graphics. it works like this:
for every pixel on screen (x, y) we feed the normalized position (something in the range of -1 to 1) as input to a small (or big) neural net of arbitrary construction, and read back the color for that pixel. Gauci and Stanley suggest using geometric and symmetric functions, but nearly anything will do. for the pictures below i used the logistic function, sine functions, multiplexers, noise and various other types of functions that seem appropriate for image creation. the neural net is then evaluated, meaning: every cell sums up its inputs, applies its function and supplies the result as input to other cells. finally "something" is propagated to the three output channels, which determine the color of the pixel at this position.
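the per-pixel idea can be sketched in a few lines. this is not the actual implementation -- the net here is just a hand-wired chain of two cells, and all names and weights are made up for illustration -- but it shows the loop: normalize the coordinates, push them through the function network, read back a color.

```python
import math

def logistic(v):
    """the logistic squashing function mentioned above."""
    return 1.0 / (1.0 + math.exp(-v))

def pixel_color(x, y, w=(3.1, 2.4, 1.7)):
    """x, y in [-1, 1]; returns an (r, g, b) tuple in [0, 1]."""
    a = math.sin(w[0] * x + w[1] * y)   # cell 1: sine of weighted inputs
    b = logistic(w[2] * a + x * y)      # cell 2: logistic of cell 1 + inputs
    # three output channels, each squashed into a valid color range
    return (logistic(a + b), logistic(a - b), logistic(a * b))

# render a small image by evaluating the net at every pixel
W, H = 64, 64
image = [[pixel_color(2.0 * px / W - 1.0, 2.0 * py / H - 1.0)
          for px in range(W)]
         for py in range(H)]
```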
so here's a first impression of what this can look like. of course one has to choose some meaningful parameters and weights for the cells and their connections. how is this done?
before i reveal this deep secret (if you haven't guessed by now) let's take a closer look at the neural net. for the pics on this page i worked with small recurrent nets of 10 cells/nodes on average. each cell has - say - 5 parameters (like amplitude and frequency of the sine function). that already makes 50 float numbers to tweak. furthermore each cell is (possibly) connected to every other cell, including itself, which adds another 100 float numbers. 150 floats at a resolution of 32 bits, of which maybe 20(?) bits per float are really significant for the final outcome (the pixel color): that makes 2 to the power of 3000, or roughly 10^900 different combinations or images. it is quite clear that no one will ever be able to see all of them. this is just to give you a small insight into the little "science fiction blackbox".
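the back-of-the-envelope count above, spelled out (the 20 significant bits per float are the same rough guess as in the text):

```python
import math

cells = 10
params = cells * 5 + cells * cells      # 50 cell parameters + 10x10 weights = 150
significant_bits = 20                   # rough guess per float
combinations_log10 = params * significant_bits * math.log10(2)
# combinations_log10 is about 903, i.e. on the order of 10^900 images
```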
well - at least i - don't really care what's going on inside the net today! for this task i am just interested in the output image. so the natural way is to look at the output and go from there to refine the image, ideally starting with a very "default", very sparsely connected network.
above you see a screenshot of the application used for the images on this page. the user sees no parameters, just pure *gfx*, and selects one of them to see new variations, or selects multiple networks to create cross-overs between them. the mutation rate (probability and max value) increases circularly around the cursor (top row, 3rd image), so that might give you an instant impression of how mutation of the network parameters influences the visual results.
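a minimal sketch of what such a selection screen does behind the scenes - assumed, not the actual implementation: mutation perturbs the flat parameter vector, and candidates further from the cursor mutate harder; crossover mixes two parents uniformly.

```python
import random

def mutate(params, strength, prob=0.3, rng=random):
    """copy of the parameter vector with random perturbations.
    strength = max mutation value, prob = per-parameter probability."""
    return [p + rng.uniform(-strength, strength) if rng.random() < prob else p
            for p in params]

def crossover(a, b, rng=random):
    """uniform crossover between two equally sized parameter vectors."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

parent = [0.0] * 150                      # the 150 floats from above
# variants in outer rings around the cursor mutate with growing strength
variants = [mutate(parent, strength=0.1 * ring) for ring in range(1, 9)]
```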
in any case! to get nice, inspiring and complex pictures, some *magic* on the network programming side, especially in the function implementation, can do no harm. by that i mean using all the scientific, numerical knowledge artistically, maybe turning a blind eye to the 'basic correctness' of one or another approach, if it only helps to enrich the image space in an efficient way.
efficiency is a good topic - but let's summarize some aspects of the SESFIG first.
since we are working with recurrent nets, raising the iteration depth (number of calculation/update steps of the net) results in various feedback loops inside the net. with some searching we can exploit them to create textural effects and more complex shapes. in a way, we are travelling through a multi-dimensional, functional space, with iteration depth being a multiplier for dimensionality.
still, though the image space is very complex, one can not *really* leave behind the "arbitrariness" in the composition of a larger image. or can one? -- it greatly depends on the compositing functions of the network cells, on the number of cells and therefore on the number of overall parameters. that's where efficiency comes in.
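iterating a recurrent net is just feeding the state vector back through the weight matrix for a number of steps, so "iteration depth" is literally a loop count. a small sketch with illustrative weights and a sine activation:

```python
import math

def step(state, weights):
    """one update step: every cell sums its weighted inputs, applies sine."""
    n = len(state)
    return [math.sin(sum(weights[i][j] * state[j] for j in range(n)))
            for i in range(n)]

def run_net(inputs, weights, depth):
    """iterate the recurrent net `depth` times; deeper = more feedback."""
    state = list(inputs)
    for _ in range(depth):
        state = step(state, weights)
    return state

n = 4
weights = [[0.5 if i == j else 0.2 for j in range(n)] for i in range(n)]
shallow = run_net([0.3, -0.1, 0.7, 0.0], weights, depth=1)
deep    = run_net([0.3, -0.1, 0.7, 0.0], weights, depth=8)
```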
and that's where i stop my article today...
the notes on efficiency
for dome masters one normally uses or simulates a 180 degree hemispheric camera, or fisheye camera. so far in our little science fiction explorer we used X and Y as input, so the final image is simply the result of some function f(x,y). for dome masters we could now simply add a Z coordinate representing the height of the dome, as in this picture:
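deriving that extra Z input can be sketched as follows: each pixel inside the fisheye disc maps to a point on the unit hemisphere, whose height becomes the z input. the mapping below assumes the common equidistant fisheye projection - the article doesn't say which projection is actually used.

```python
import math

def dome_inputs(px, py, size):
    """pixel of a square dome master -> (x, y, z) on the unit hemisphere,
    or None for pixels outside the 180-degree image circle."""
    x = 2.0 * px / size - 1.0
    y = 2.0 * py / size - 1.0
    r = math.hypot(x, y)
    if r > 1.0:
        return None                   # corner pixels lie outside the dome
    theta = r * math.pi / 2.0         # angle from the zenith, 0..90 degrees
    phi = math.atan2(y, x)
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))          # z: 1 at the zenith, 0 at the rim
```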
however, since we use random initial conditions and an evolutionary approach which only considers the resulting image, the neural cells *might* not 'care' for the correct representation in 3d space. for example, you might find a nice setup which looks cool on screen (2d dome master) but whose Z input is not connected at all... projecting that picture onto a dome will necessarily result in perspective distortions. but then, what is the right perspective in a generated image like this? since we are not dealing with realistic 3d scenes but with an imaginative realm, the observer in the dome will probably have a quite individual sense of perspective of the scene anyway. there is no *ONE* solution to obtain a nice dome perspective.
in this picture, one of the example dome masters has been put onto a virtual dome. this might give a first impression of how the images work, in case you don't have instant full dome projection access ;).
whatever input you provide to the net in order to achieve a dome master is a purely artistic decision - from my point of view. in the sample images above and below, various methods have been applied. some of them are: true X,Y,Z coordinates for every point on the dome; X,Y and distance to center; or 'spherically bent' X and Y coordinates to restrict the network to certain dome-like distortions.
animating the art mines
there are probably a lot of sine functions with different frequencies involved in the creation of the images. so one way to 'animate' an image is to shift the phases of the sine functions. this is shown in the video below. each function's phase is incremented by an amount determined by the parameter set.
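a sketch of this phase-shift animation - the class and its fields are illustrative, not the actual code: every sine cell carries its own phase, and each frame advances it by a per-cell increment taken from the parameter set.

```python
import math

class SineCell:
    def __init__(self, freq, phase, phase_speed):
        self.freq = freq
        self.phase = phase
        self.phase_speed = phase_speed   # per-frame increment from the params

    def advance(self):
        """shift the phase for the next animation frame."""
        self.phase = (self.phase + self.phase_speed) % (2.0 * math.pi)

    def __call__(self, v):
        return math.sin(self.freq * v + self.phase)

cells = [SineCell(freq=f, phase=0.0, phase_speed=0.01 * f)
         for f in (1.0, 2.5, 4.0)]

frames = []
for frame in range(10):
    frames.append([c(0.5) for c in cells])   # sample each cell this frame
    for c in cells:
        c.advance()                          # phases drift, the image moves
```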
the next film, however, shows an actual morph between various evolved parameter sets. the blending is done in a linear fashion, which looks kinda awful from my point of view, but that's a quick hack anyway. in the end some sophistication is needed to design and implement the morphing scriptability.
as mentioned earlier, there is a FIXED number of parameters in the system (network), contrary to an approach where new cells and connections appear as a result of mutation. while the latter simplifies the navigation in search space, sticking with fixed topologies makes it easy to morph between any two sets. for arbitrary topologies that could still be done but is much more complicated and time consuming.
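with a fixed topology the two parameter vectors line up one-to-one, so the linear morph (the "quick hack" above) is nothing more than per-parameter interpolation:

```python
def morph(params_a, params_b, t):
    """linear blend of two equally sized parameter vectors, t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(params_a, params_b)]

# toy parameter sets for illustration
set_a = [0.0, 1.0, -2.0]
set_b = [4.0, 1.0,  2.0]
halfway = morph(set_a, set_b, 0.5)   # -> [2.0, 1.0, 0.0]
```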
to make an animation clip also appealing to the auditive senses, i've used a neat algorithm which 'objectively' calculates the sound for each animation frame. numerous sound sources are distributed evenly over the image, each gathering the average rgb color in its neighbourhood. fortunately the rgb color scale can quite simply be transformed into a set of oscillators tuned to the color's frequency components.
visible light lives in the realm of terahertz, or 10^12 hz. to get those frequencies into the audible realm while keeping their frequency ratios intact we have to octavize them down. transposing a frequency down an octave means dividing it by 2. the power of two closest to 10^12 is 2^40, i.e. 40 octaves. so if, for example, red is at about 430 thz, we can simply calculate 430 x 10^12 hz / 2^40 ≈ 391.08 hz.
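the octavization as code - just the division by 2^40 from the text:

```python
def octavize_down(freq_thz, octaves=40):
    """map a light frequency in THz to Hz by transposing down whole octaves."""
    return freq_thz * 1e12 / (2 ** octaves)

red_hz = octavize_down(430.0)   # about 391.08 hz, close to the note G4
```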
the next step is to correctly tune the oscillators to mixtures of the pure 'rainbow colors'. in the following clip the algorithm subtracts small fractions of all known pure colors from the color in question until nothing remains, and sets the frequency and amplitude parameters of a set of oscillators accordingly. however, the range of visible light itself only spans about one octave. to make the sound more interesting, lower frequencies are mixed with sub-octaves, while higher frequencies are accompanied by their octavized overtones. finally this is done for every sound source within the screen area, and they are mixed down to four channels, as if a microphone in every corner of the screen recorded the sound sources according to their distance. the result can be seen and heard (as stereo mix) below.
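a rough sketch of that pipeline - the real decomposition into pure rainbow colors is more involved, and the primary frequencies below (430/545/650 thz, octavized down) as well as the inverse-distance mixing are my own simplifications: each source's rgb color becomes oscillator amplitudes, and the sources are mixed to four corner channels weighted by distance.

```python
import math

# octavized primaries in hz (roughly red 430, green 545, blue 650 thz / 2^40)
PRIMARY_HZ = (391.1, 495.7, 591.2)

def source_oscillators(rgb):
    """map an (r, g, b) color in [0, 1] to (frequency, amplitude) pairs."""
    return [(f, a) for f, a in zip(PRIMARY_HZ, rgb) if a > 0.0]

def mix_to_corners(sources, width, height):
    """sources: list of ((x, y), rgb). returns four corner channel gains,
    one per 'microphone', closer sources being louder."""
    corners = [(0, 0), (width, 0), (0, height), (width, height)]
    channels = [0.0] * 4
    for (x, y), rgb in sources:
        loudness = sum(rgb) / 3.0
        for i, (cx, cy) in enumerate(corners):
            d = math.hypot(x - cx, y - cy)
            channels[i] += loudness / (1.0 + d)   # simple distance falloff
    return channels
```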