
Google knows the value of Vector Space Mathematics

http://www.technologyreview.com/view/519581/how-google-converted-language-translation-into-a-problem-of-vector-space-mathematics/

How Google Converted Language Translation Into a Problem of Vector Space Mathematics

To translate one language into another, find the linear transformation that maps one to the other. Simple, say a team of Google engineers.

Computer science is changing the nature of the translation of words and sentences from one language to another. Anybody who has tried BabelFish or Google Translate will know that they provide useful translation services but ones that are far from perfect.

The basic idea is to compare a corpus of words in one language with the same corpus of words translated into another. Words and phrases that share similar statistical properties are considered equivalent.

The problem, of course, is that the initial translations rely on dictionaries that have to be compiled by human experts and this takes significant time and effort.

Now Tomas Mikolov and a couple of pals at Google in Mountain View have developed a technique that automatically generates dictionaries and phrase tables that convert one language into another.

The new technique does not rely on versions of the same document in different languages. Instead, it uses data mining techniques to model the structure of a single language and then compares this to the structure of another language.

“This method makes little assumption about the languages, so it can be used to extend and refine dictionaries and translation tables for any language pairs,” they say.

The new approach is relatively straightforward. It relies on the notion that every language must describe a similar set of ideas, so the words that do this must also be similar. For example, most languages will have words for common animals such as cat, dog, cow and so on. And these words are probably used in the same way in sentences such as “a cat is an animal that is smaller than a dog.”

The same is true of numbers. The image above shows the vector representations of the numbers one to five in English and Spanish and demonstrates how similar they are.

This is an important clue. The new trick is to represent an entire language using the relationship between its words. The set of all the relationships, the so-called “language space”, can be thought of as a set of vectors that each point from one word to another. And in recent years, linguists have discovered that it is possible to handle these vectors mathematically. For example, the operation ‘king’ – ‘man’ + ‘woman’ results in a vector that is similar to ‘queen’.
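The word arithmetic can be made concrete with a small Python sketch. The three-dimensional vectors below are invented purely for illustration (real word vectors are learned from text and have hundreds of dimensions), and the helper names `cosine` and `analogy` are hypothetical.

```python
import math

# Toy 3-D "embeddings", invented for illustration only.
# The axes loosely stand for (royalty, maleness, femaleness).
VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def analogy(a, b, c, vectors=VECTORS):
    """Return the word closest to a - b + c, excluding the three input words."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(analogy("king", "man", "woman"))
```

Excluding the input words from the candidates is a common convention in this kind of lookup; without it the nearest vector is often one of the inputs themselves.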

It turns out that different languages share many similarities in this vector space. That means the process of converting one language into another is equivalent to finding the transformation that converts one vector space into the other.

This turns the problem of translation from one of linguistics into one of mathematics. So the problem for the Google team is to find a way of accurately mapping one vector space onto the other. For this they use a small bilingual dictionary compiled by human experts: comparing the same corpus of words in two different languages gives them a ready-made linear transformation that does the trick.

Having identified this mapping, it is then a simple matter to apply it to the bigger language spaces. Mikolov and co say it works remarkably well. “Despite its simplicity, our method is surprisingly effective: we can achieve almost 90% precision@5 for translation of words between English and Spanish,” they say.
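Here is a minimal sketch of the idea in Python, assuming the mapping is a matrix W fitted by minimizing the squared error between W·x and z over the seed dictionary pairs, and that a translation is the nearest target word by cosine similarity. The two-dimensional vectors are invented (the "Spanish" space is simply the "English" space rotated by 90 degrees), and every name in the code is hypothetical.

```python
import math

# Invented 2-D vectors; the Spanish space is the English space rotated by 90 degrees.
EN = {"one": [1.0, 0.0], "two": [0.0, 1.0], "three": [1.0, 1.0], "four": [2.0, 1.0]}
ES = {"uno": [0.0, 1.0], "dos": [-1.0, 0.0], "tres": [-1.0, 1.0], "cuatro": [-1.0, 2.0]}
SEED = [("one", "uno"), ("two", "dos"), ("three", "tres")]  # small seed dictionary

def apply_map(W, x):
    """Multiply matrix W by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def fit_linear_map(seed, lr=0.1, epochs=500):
    """Fit W by stochastic gradient descent on the squared error ||W x - z||^2."""
    dim = 2
    W = [[0.0] * dim for _ in range(dim)]
    for _ in range(epochs):
        for en_word, es_word in seed:
            x, z = EN[en_word], ES[es_word]
            err = [p - t for p, t in zip(apply_map(W, x), z)]
            for i in range(dim):
                for j in range(dim):
                    W[i][j] -= lr * 2 * err[i] * x[j]
    return W

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def translate(word, W):
    """Map an English vector into the Spanish space and return the nearest word."""
    mapped = apply_map(W, EN[word])
    return max(ES, key=lambda w: cosine(ES[w], mapped))

W = fit_linear_map(SEED)
print(translate("four", W))  # "four" was not in the seed dictionary
```

The point of the toy is that the mapping is learned from three word pairs only and then applied to a word the seed dictionary never contained.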

The method can be used to extend and refine existing dictionaries, and even to spot mistakes in them. Indeed, the Google team do exactly that with an English-Czech dictionary, finding numerous mistakes.

Finally, the team point out that since the technique makes few assumptions about the languages themselves, it can be used on argots that are entirely unrelated. So while Spanish and English have a common Indo-European history, Mikolov and co show that the new technique also works just as well for pairs of languages that are less closely related, such as English and Vietnamese.

That’s a useful step forward for the future of multilingual communication. But the team says this is just the beginning. “Clearly, there is still much to be explored,” they conclude.

Ref: arxiv.org/abs/1309.4168: Exploiting Similarities among Languages for Machine Translation

 

Vectorave Operating Procedures, etc.

Vectorave A/V is a low-bandwidth, reference-based compression scheme for broadcast and reception. The concept was created to facilitate the improvement of virtual reality and to address the diminishing bandwidth limits of radio and internet.

Process:

1) Maintain a common database(1) of vectorized objects, images, skins, textures and persons, and load it onto both the media management server/transmitter(3) and the media management server/receiver device(4) prior to retail purchase.

2) Create a universal internet update service(5) that downloads all newly identified vector objects to all receiver servers as they become available or as they are needed.

Vector-based Video Pre-production:
3) Thoroughly scout(6) and videotape all scenes to be shot for all objects in future production. Identify and log(7) all objects in scene.

4) The AVIE(8) creates a rough draft 3D interpretation of all scenes for the upcoming shoot.

5) The authorized art director modifies the 3D scene to remove all traces of production equipment(9) and then adds an artificial illusion of an adjacent scene(10) that complements the main set.

6) All talent(11) and props(12) are imaged for inclusion in the production.

7) The more objects identified and vectorized prior to a production, the less processing will be needed for broadcast.

8) For live events, vector interpreting production assistants(13) identify unknown AVIE-flagged objects(14) in real time as the technical director(15) previews(16) or takes camera shots.

9) As stereoscopic video(17) is shot for a production, the AVIE creates a real-time 3D map(11) of all animated objects and persons(18) within each scene using pre-vectorized object data(19).

10) As the footage is shot, it is sent to the AVIE database(20) to be further interpreted. Any objects the AVIE cannot identify are flagged for future human identification(21).

11) After all scenes and safeties(22) are shot, the footage data(23) is sent to a post-production object identifier(24). This person reviews all AVIE-flagged objects and identifies and logs each previously unknown object.

12) The editor(24) arranges all scenes and shots for the best presentation possible. From this point on, all views, angles and camera shots(25) are infinitely variable.

Vector-based A/V uses an object recognition processor, or interpreting engine, to isolate an object in a scene over any given period in which it exists in the video. The interpreting engine then quickly generates random 3D objects fitting the fuzzy criteria of the object, given its dimensions and its relations to the identified objects surrounding it. The engine also looks for reference shadows, reflections and other trace interactions with identified objects in the unidentified object’s proximity. Similar 3D vector objects are randomly generated on a matching linear video timeline until one shape most closely resembles the unknown shape. That shape is then compared with all existing object indexes and objects until a match is found. If no match is found, the object is identified as unknown and flagged for identification by an Audio Vector Interpreting Operative (AVIE).
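A rough sketch of that loop in Python follows: a random search over candidate shapes, then a nearest-neighbour lookup in the object index with a rejection threshold. This is only my illustration under heavy assumptions. Shapes are reduced to a (width, height, depth) box, and the index contents, the tolerance and the function names are all hypothetical.

```python
import math
import random

# Hypothetical shared index: object name -> (width, height, depth) in metres.
OBJECT_INDEX = {
    "chair": (0.5, 0.9, 0.5),
    "table": (1.5, 0.75, 0.8),
    "lamp":  (0.3, 1.6, 0.3),
}

def random_fit(observed, tries=20000, seed=0):
    """Propose random boxes and keep the one closest to the observed measurements."""
    rng = random.Random(seed)
    best, best_err = None, float("inf")
    for _ in range(tries):
        guess = tuple(rng.uniform(0.0, 3.0) for _ in range(3))
        err = math.dist(guess, observed)
        if err < best_err:
            best, best_err = guess, err
    return best

def identify(shape, index=OBJECT_INDEX, tolerance=0.3):
    """Return the closest indexed object, or None to flag the shape as unknown."""
    name = min(index, key=lambda n: math.dist(index[n], shape))
    return name if math.dist(index[name], shape) <= tolerance else None

fitted = random_fit((0.52, 0.88, 0.5))   # rough measurements taken from the video
print(identify(fitted))                  # a match, or None if it needs a human
```

A real engine would compare full meshes, shadows and reflections rather than three numbers, but the control flow (generate, compare, match or flag) is the same.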

Challenges of a vector-based production
a) Gigantic Polygon Meshes

1 – Scouting for vector-based recordings (video and audio) is done by a vector scout. The scout creates close-up video of all objects and scenes that are expected to be in the production. The scout then processes that video with the vector A/V interpreting engine and identifies and logs all XYZ objects in the scout video.

2 – The vector A/V interpreting engine initially identifies the objects it can interpret using character recognition routines, the pre-identified objects entered by the vector scout and an extensive database of pre-installed reference objects.

3 – Any object that cannot be identified is flagged within the running frames as an unknown entity and placed on the vector matrix at the XYZ location best represented by AI deduction, using objects suitably recreated from the production video, raw footage, safety shots and the original scouting video.

4 – Posting vector-based productions is done by a vector interpreter.
a) After a production is recorded it initially takes an operative to decipher and identify unknown objects and sounds which were misidentified or missed by the initial vector scout.

5 – Correspondence problem: “Given two or more images of the same 3D scene, taken from different points of view, the correspondence problem is to find a set of points in one image which can be identified as the same points in another image. A human can normally solve this problem quickly and easily, even when the images contain significant amount of noise. In computer vision the correspondence problem is studied for the case when a computer should solve it automatically with only the images as input. Once the correspondence problem has been solved, resulting in a set of image points which are in correspondence, other methods can be applied to this set to reconstruct the position of the corresponding 3D points in the scene.”

“The correspondence problem typically occurs when two images of the same scene are used, the stereo correspondence problem. This concept can be generalized to the three-view correspondence problem or, in general, the N-view correspondence problem. In the general case, the images can either come from N different cameras which depict (more or less) the same scene or from one and the same camera which is moving relative to the scene. An even more difficult version of the correspondence problem occurs when the objects in the scene can be in general motion relative to the camera(s).” From Wikipedia

“A typical application of the correspondence problem occurs in image mosaicing — when two or more images which only have a small overlap are to be stitched into a larger composite image. In this case it is necessary to be able to identify a set of corresponding points in a pair of images in order to calculate the transformation of one image to stitch it onto the other image.” From Wikipedia
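To give a feel for the stereo case, here is a minimal Python sketch of block matching along a single scanline using the sum of absolute differences. It assumes two already-aligned grayscale rows given as lists of pixel values; the window size, the maximum disparity and the sample data are made up for the example.

```python
def match_scanline(left, right, window=1, max_disparity=4):
    """For each pixel x in `left`, find the shift d where right[x - d] matches best."""
    disparities = []
    for x in range(len(left)):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disparity + 1):
            cost = 0
            for k in range(-window, window + 1):
                xl, xr = x + k, x + k - d
                if 0 <= xl < len(left) and 0 <= xr < len(right):
                    cost += abs(left[xl] - right[xr])
                else:
                    cost += 255  # penalise windows that fall off the image
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities

# Invented data: the right row is the left row shifted by two pixels.
left_row = [10, 20, 30, 200, 50, 120, 90, 40, 15, 70]
right_row = left_row[2:] + [0, 0]
print(match_scanline(left_row, right_row))
```

Once each pixel has a disparity, depth follows from the camera geometry, which is the reconstruction step the quote refers to. Pixels near the row edges are unreliable here because their windows fall partly outside the image.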

(25) Establishing Shot, EWS (Extreme Wide Shot), VWS (Very Wide Shot), WS (Wide Shot), MS (Mid Shot), MCU (Medium Close Up), CU (Close Up), ECU (Extreme Close Up), CA (Cutaway), OSS (Over-the-Shoulder Shot), Noddy (Reaction Shot), POV (Point-of-View Shot) and Weather Shots

 

Deductive Transition Algorithm

Predicts the 3D movement of an object from one point to another using the logistics of the object, its ancillary parts, and the current trajectory paths of the sum of its parts.
I made a note of this algorithm in the past. The wording actually sounds too sophisticated to be mine; however, I can no longer find it on the internet. Advanced circumstantial deduction is a very important key to the success of Vectorave A/V.
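Since I cannot find the original, here is only a stand-in: a plain constant-velocity extrapolation in Python that averages the motion of an object's parts and projects the centre forward. It is not the algorithm from the note, and the function name and parameters are hypothetical.

```python
def predict_position(parts_prev, parts_now, dt, horizon):
    """Extrapolate the object's centre from the mean velocity of its parts.

    parts_prev / parts_now: (x, y, z) positions of each part at two instants dt apart.
    horizon: how far ahead, in the same time units, to predict.
    """
    n = len(parts_now)
    centre = [sum(p[i] for p in parts_now) / n for i in range(3)]
    velocity = [
        sum(now[i] - prev[i] for prev, now in zip(parts_prev, parts_now)) / (n * dt)
        for i in range(3)
    ]
    return [c + v * horizon for c, v in zip(centre, velocity)]

# Two parts of one object, both moving one unit along X per time step.
print(predict_position([(0, 0, 0), (2, 0, 0)], [(1, 0, 0), (3, 0, 0)], dt=1, horizon=2))
```

A real deductive step would also need to know how the parts are allowed to move relative to each other (joints, wheels, hinges), which this sketch ignores.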

What is Vectorave AV (data compression and transfer)?

Vectorave TV is a combination of small vector-based instructions that reference a shared database of common elements, such as 3D objects, shapes, sounds, patterns, voice prints, skins or textures. Each Vectorave AV viewer or participant has a computer with a database of vector objects that is updated regularly from central locations as new objects are identified and added. Processing of video or images involves multiple sources.
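The reference-based idea can be sketched in a few lines of Python. This is a toy under stated assumptions: the database contents, the object ids and the JSON message format are all invented for illustration, and a real system would also need to handle objects that are not yet in the database.

```python
import json

# Hypothetical shared database, present on both the transmitter and the receiver.
SHARED_DB = {
    101: {"name": "chair", "mesh": "<large mesh data>"},
    102: {"name": "table", "mesh": "<large mesh data>"},
}

def encode_frame(scene):
    """Transmitter: replace each known object with a small [id, position] instruction."""
    ids = {entry["name"]: oid for oid, entry in SHARED_DB.items()}
    return json.dumps([[ids[name], pos] for name, pos in scene])

def decode_frame(message):
    """Receiver: rebuild the scene by looking each id up in the local database."""
    return [(SHARED_DB[oid]["name"], tuple(pos)) for oid, pos in json.loads(message)]

scene = [("chair", (1.0, 0.0, 2.5)), ("table", (0.0, 0.0, 3.0))]
message = encode_frame(scene)
print(len(message), "bytes sent instead of the full mesh data")
print(decode_frame(message))
```

Only the ids and positions cross the wire; the heavy mesh data never leaves the local database, which is where the bandwidth saving comes from.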

Video and images can be recorded directly as holography. The holographic video or images are then scanned to find matching familiar objects in the shared database. All objects that do not match are either replaced by the computer with textured one-dimensional images in an XYZ plane, or sent to human interpreters for identification and conversion to 3D. Examples of the huge range of vector-based holographic objects would be appliances, cars, known architecture, familiar movements or paths, variations of plant life, and complex human and animal shapes that can be altered to represent live actors.

Two-dimensional video and images can also be recorded or scanned and converted after being identified, or matched in two dimensions with three-dimensional objects. The computer then places the 3D object in a 3D plane and compares the original pans and zooms with simulated, deductive trial-and-error pans and zooms until the XYZ location of the object in 3D matches the pan and zoom results of the 2D object. Static images can be placed less accurately in 3D photos using the deduced size and location of an object. Some of these objects will need to be placed by human interpreters until effective interpreting algorithms are written.
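For the static-image case, a minimal Python sketch of the trial-and-error placement might look like this. It assumes a simple pinhole camera with a made-up focal length of 800 pixels and an object whose real width is already known from the database; all names and numbers are hypothetical.

```python
def project(point, focal=800.0):
    """Pinhole projection of a 3D point (X, Y, Z) to image coordinates (x, y)."""
    X, Y, Z = point
    return (focal * X / Z, focal * Y / Z)

def place_object(observed_xy, observed_width_px, real_width, focal=800.0):
    """Try candidate depths and keep the one whose projected width matches best."""
    best_z, best_err = None, float("inf")
    for step in range(1, 1001):           # candidate depths 0.1 m to 100.0 m
        z = step / 10.0
        err = abs(focal * real_width / z - observed_width_px)
        if err < best_err:
            best_z, best_err = z, err
    x, y = observed_xy
    # Back-project the image position to 3D at the chosen depth.
    return (x * best_z / focal, y * best_z / focal, best_z)

# A 2 m wide object seen 200 px wide, centred at pixel offset (100, 50).
print(place_object((100.0, 50.0), 200.0, 2.0))
```

With video, the same search could be repeated across the frames of a pan or zoom and the candidate kept that agrees with all of them, which is closer to the matching-pans idea described above.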

 
Vectorave Open-Source HoloTelevision

 

The author of this webpage has the grand dream of inviting different specialists and students to participate in an open-source collaboration that will allow us all to bypass commercial interests and develop a Holographic Television process that surpasses the capabilities of all present technologies.
Here are other websites addressing HoloTV issues:

http://holo-tv.com/

http://www.dvice.com/2013-6-26/holographic-tv-might-be-closer-you-think
 

Here are the open-source Holo-TV issues and components I wish to address:
                                   

1) The Basic Concept of Holo-Vector Processing

2) The Necessary Software

3) The Necessary Hardware

4) Standardization

5) Implications of Virtual Reality Television

6)  The Final Goal of a Believable Virtual Reality

My Holo-TV concept is called Vectorave A/V, a reference-based form of broadcast and reception. Here are many of the factors I have considered in the contemplation of the concept:

The 4 basic categories of Vectorave A/V are:

a) Video
b) Audio
c) Virtual Reality
d) Generic Data

(I also consider Vectorave A/V to be an ideal data compression concept)

Here are more issues and components to consider regarding Vectorave A/V:

1)     Prior Art

    a)  RaveGrid

     https://www.svgopen.org/2007/papers/RaveGrid/index.html
RaveGrid (Raster to vector Graphics for image data) version 2.5* was the leading image vectorization and image segmentation application available at one time, taking raster images and turning them into smaller, editable vector images in the SVG format.

    b) Star Trek NG Identity Crisis
(a clear example of Holo-Vector Processing)

2)    Production Strategies

3)    Gigantic Polygon Meshes

4)    Vector Processing

5)    Hidden Markov models

6)    The Viterbi Algorithm

7)    Real-time motion blur and echo de-vectorization

8)    Lowest bandwidth data delivery

9)    Polylines

10)  Polygons

11)   Circles and Ellipses

12)   Bézier curves

13)   Bezigons

14)   Automated 3D modeling

15)   In 3D computer graphics, vectorized surface representations are most common (bitmaps can be used for special purposes such as surface texturing, height-field data and bump mapping).

16)     Raster to vector conversion