VRML stands for Virtual Reality Modelling Language. It was first introduced at the First World Wide Web Conference in 1994 by Dave Raggett. In VRML, 3D shapes are specified in terms of vertices and edges, along with mapped textures, opacity and so on. Pages constructed from VRML (often called worlds) can be viewed through normal browsers by installing a plugin such as Cortona3D Viewer [2] or Cosmo Player [3]. After various updates, VRML has been superseded by X3D [4], and a number of alternatives have appeared, such as Google's O3D [5] and Unity3D [6].
HTML DOM
The Document Object Model (DOM) defines a standard method of accessing and manipulating XML (and HTML) documents, endorsed by the W3C [7]. HTML nodes are represented in a tree structure, with methods that can get, change, add or remove HTML elements on the fly. The simplest way of using the DOM is through JavaScript, which can interface directly with the DOM through methods such as getElementById().
Case Study
How do VR browsers go about making the WWW an easier place to navigate?
From research carried out on different applications of VR to the World Wide Web, three main approaches have been identified. For the purposes of comparison, one concrete example was taken from each approach, and each shall be discussed briefly. These are:
Content-centric: VR is used to organize the content of the webpage, e.g. SpaceTime3D [8]
Structure-centric: VR is used to visualize the structure of the webpage, e.g. VisualExpresso [9]
World-centric: VR is used to generate the world in which the webpage will be shown, e.g. ExitReality3D [10]
A series of questions has been formulated to gauge the effectiveness of each browser. Each browser's review below answers them in the order given here. The questions were:
Are the controls simple to understand, and easy to use?
Is information displayed in a meaningful way in the application's context?
Is the information easy to read?
Does the application allow for a collaborative effort towards the web browsing experience?
SpaceTime 3D
SpaceTime3D deals with conventionally rendered webpages (using IE's rendering engine). However, instead of presenting them as a flat view, the application renders each webpage to a plane, which is then placed in a 3D world. The planes can be organized in stacks which the user can then sift through.
SpaceTime3D's strength lies in visual searching, where search results are placed in a stack. These can include webpages, images or even videos. This is in contrast to other search engines, where results are displayed in text format.
SpaceTime3D also has a web component (http://www.spacetime.com) that makes the visual search available in a Flash application.
The control scheme is not very intuitive and it takes a long time to get used to. Unnecessary functions abound such as the 'ability' to position the camera behind the browser windows, and about four different movement modes, which all appear to do the same thing.
The application handles its content display quite well.
The information is very easy to read.
SpaceTime3D allows the user to save and open 'spaces', which correspond to the pages open at the time of saving. However, this seems more of a convenience for the individual user than a collaborative feature.
SpaceTime3D makes a sizable leap from conventional web browsing. However, the controls are likely to drive most users back to other browsers that offer greater ease of use.
Visual Expresso
VisualExpresso is more of a tool for webmasters than a browser. The actual output of the application is a VRML world, which is then viewed in a conventional browser. However, for the purposes of this study the author considered it a noteworthy (and probably unique) addition to the cases being examined.
VisualExpresso was one of the first attempts to break away from the traditional format of displaying a web page 'and everything in it'. The application featured in the 1999 paper "VisualExpresso - Generating a Virtual Reality Internet". Even in the days of internet infancy, the authors of the paper were already citing difficulties with information overload, and detailing their answer to the problem.
The solution they proposed was to:
Examine the structure of each page on a website, identifying the hyperlinks between pages
Create a data structure to store the structure of the website (not the content)
Output a VRML file illustrating the structure of the website. The file is then displayed in the user's browser using a VRML plugin.
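The three steps above can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions, not VisualExpresso's actual implementation: the `siteLinks` data, the function names `buildGraph` and `toVrml`, and the circular layout are all hypothetical. The VRML nodes used (Transform, Shape, Sphere, IndexedLineSet, Coordinate) are standard VRML 2.0 nodes.

```javascript
// Step 1 (assumed already done): a crawl produced this map of
// page -> pages it links to. The data is illustrative only.
const siteLinks = {
  "index.html": ["about.html", "news.html"],
  "about.html": ["index.html"],
  "news.html": [],
};

// Step 2: keep only the structure (pages and directed links), not the content.
function buildGraph(links) {
  const pages = [...new Set([...Object.keys(links), ...Object.values(links).flat()])];
  const edges = [];
  for (const [from, targets] of Object.entries(links)) {
    for (const to of targets) edges.push([pages.indexOf(from), pages.indexOf(to)]);
  }
  return { pages, edges };
}

// Step 3: emit a VRML 2.0 world. Pages become spheres placed on a circle,
// hyperlinks become line segments between them.
function toVrml({ pages, edges }, radius = 5) {
  const coords = pages.map((_, i) => {
    const angle = (2 * Math.PI * i) / pages.length;
    return [radius * Math.cos(angle), 0, radius * Math.sin(angle)].map(v => v.toFixed(2));
  });
  const spheres = coords.map(c =>
    `Transform { translation ${c.join(" ")} children [ Shape { geometry Sphere { radius 0.5 } } ] }`);
  const lines = `Shape { geometry IndexedLineSet {
  coord Coordinate { point [ ${coords.map(c => c.join(" ")).join(", ")} ] }
  coordIndex [ ${edges.map(([a, b]) => `${a}, ${b}, -1`).join(", ")} ]
} }`;
  return ["#VRML V2.0 utf8", ...spheres, lines].join("\n");
}

const graph = buildGraph(siteLinks);
const world = toVrml(graph); // text that could be saved as a .wrl file
```

Because only the link graph is stored, the size of the generated world depends on the number of pages and links rather than on the amount of content in each page.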
In itself VisualExpresso lacks an innate control scheme. However the one offered by the VRML player is adequate for the purposes of navigation. It only takes a few minutes to become accustomed to the controls.
In the sample world provided, only hyperlinks are drawn in the world. That being said it does provide a good picture of the link structure of the webpage, with internal, external and mailto links being clearly distinguishable.
The information is far from easy to read. It is hard to tell where one website ends and another begins, and it took the author a while to realize that multiple web pages were being represented in the sample world. The large number of criss-crossing lines between hyperlinks also makes it difficult for the user to make out individual links.
VisualExpresso offers no facility for collaborative browsing.
One would have assumed that the DOM would have come in very handy had it been used in this application; however, there seems to be no mention of it in the paper. The paper also raises the issue of "requiring extensive computing power" to perform its functions, something that in the case of loading webpages is no longer a concern.
ExitReality3D
ExitReality3D is a web plugin that automatically generates a 3D world from a 2D website. It works in-browser, and through a drag-and-drop interface allows users to customize their own 3D space.
The controls are thoroughly explained, and the interface is very intuitive. The application feels like one is playing some sort of game rather than browsing the internet.
This is subject to the work done on a particular site. For rooms that are undeveloped, there is a default template which is derived from content found on the page. For developed sites the content present in the room is far richer.
The content in rooms is visually stimulating, which for a site intended for marketing purposes would probably leave a greater impact on its audience than an ordinary website.
Individual users can define the content of their own web space independent of any sites, and can also invite others into their space.
NB: In the case of world-centric browsers, Second Life would also have been a valid contender. ExitReality3D was chosen over it because ExitReality3D is marketed as a 3D extension of the web, whereas Second Life is an autonomous virtual world.
Webpage Segmentation
As mentioned earlier, the DOM is a vital tool in understanding the content structure and presentation of a webpage. A possible use of the DOM is to 'split' the webpage into a series of distinct visual modules. On visual inspection, the following website would be divided as follows:
[Figure: dividedtimes.png]
Fig. 1: A typical webpage and its observed divisions. The blue borders indicate distinct visual blocks.
Automating this process could greatly enhance the user's browsing experience, since it would lay the foundations for users to be able to define their own layout for a particular website. There are various algorithms available to solve this problem:
Record-Boundary Discovery
Record-Boundary Discovery (RBD) defines records as being "groups of information relevant to some entity". These records are located inside subtrees, which the algorithm locates. RBD then identifies certain tags that might be separators (i.e. split the subtree into distinct elements), and then chooses a best fit (called a consensus) based on a "combined heuristic".
The heuristics are the following:
Highest-Count takes a tally of the occurrences of each different type of node in the document
Identifiable Separator notes the appearance of tags such as <HR> which serve the purpose of visually dividing subtrees
Standard Deviation uses the observation that when multiple records appear in a document, they are typically the same size. A good candidate tag might be the one with the smallest standard deviation, given the size of the plaintext between identical tags.
Repeating-tag Pattern uses the conjecture that divisions can be created by using the same tag multiple times (e.g. two or more <BR> tags)
Ontology Matching analyses the content of a record, and uses the tendency of certain important pieces of information occurring "once and only once" in a record.
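Two of the heuristics above, Highest-Count and Standard Deviation, lend themselves to a short sketch. The following is an illustration only and does not reproduce the published algorithm: the flattened `subtree` encoding (tag names as strings, plaintext lengths as numbers), the function names, and the sample data are all assumptions, and the step that combines the individual rankings into a consensus is left out.

```javascript
// Hypothetical flattened subtree: strings are tag names, numbers are the
// lengths of the plain text found between tags.
const subtree = ["hr", 30, "b", 70, "hr", 90, "b", 10, "hr", 50, "b", 50, "hr"];

// Tally how often each tag occurs.
function tagCounts(seq) {
  const counts = new Map();
  for (const item of seq) {
    if (typeof item === "string") counts.set(item, (counts.get(item) || 0) + 1);
  }
  return counts;
}

// Amount of plain text between consecutive occurrences of `tag`.
function intervalSizes(seq, tag) {
  const sizes = [];
  let current = null; // stays null until the first occurrence of the tag
  for (const item of seq) {
    if (item === tag) {
      if (current !== null) sizes.push(current);
      current = 0;
    } else if (typeof item === "number" && current !== null) {
      current += item;
    }
  }
  return sizes;
}

// Population standard deviation; NaN for an empty list.
function stdDev(values) {
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  const variance = values.reduce((a, v) => a + (v - mean) ** 2, 0) / values.length;
  return Math.sqrt(variance);
}

// Highest-Count: most frequent tag first.
function rankByHighestCount(seq) {
  return [...tagCounts(seq)].sort((a, b) => b[1] - a[1]).map(([tag]) => tag);
}

// Standard Deviation: tag with the most evenly sized intervals first.
function rankByStdDev(seq) {
  return [...tagCounts(seq).keys()]
    .map(tag => ({ tag, sd: stdDev(intervalSizes(seq, tag)) }))
    .filter(c => !Number.isNaN(c.sd)) // tags seen only once give no interval
    .sort((a, b) => a.sd - b.sd)
    .map(c => c.tag);
}
```

In the sample data the text between successive <HR> tags is always 100 characters long, while the text between successive <B> tags varies, so both rankings place "hr" first as the candidate separator.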
Vision-based Segmentation Algorithm
The Vision-based Page Segmentation (VIPS) algorithm uses visual information obtained from an HTML document's DOM to divide nodes into "Visual Blocks". Each block has a Degree of Coherence (DoC), and to qualify as a suitable visual block its DoC must exceed a pre-defined Permitted Degree of Coherence (PDoC); otherwise the block is divided further. VIPS is based on a set of heuristics, divided into four categories:
Tag Cues identify blocks containing certain tags such as <HR> as being less coherent than others which do not contain these tags. Thus the DoC of such blocks is decreased, making it more likely for the block to be divided.
Colour Cues identify blocks whose child nodes contain a different background as being less coherent.
Size Cues identify blocks that exceed a certain threshold when compared to their children and the overall size of the document, thus being less coherent.
Text Cues identify blocks that contain solely text nodes, which are more coherent than others.
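The interplay between the four cues and the PDoC threshold can be sketched as follows. This is a simplified illustration, not the published VIPS rule set: the block representation, the starting score, the penalty values, the 50% size threshold and the function names are all assumptions chosen for readability.

```javascript
// Assumed scoring on a 1-10 scale; all numeric weights are illustrative.
function degreeOfCoherence(block, pageArea) {
  const kids = block.children || [];
  // Text cue: a block containing only text nodes is treated as fully coherent.
  if (kids.length === 0 || kids.every(k => k.tag === "#text")) return 10;
  let doc = 8;
  // Tag cue: a separator such as <HR> among the children lowers coherence.
  if (kids.some(k => k.tag === "hr")) doc -= 3;
  // Colour cue: a child with a different background lowers coherence.
  if (kids.some(k => k.background && k.background !== block.background)) doc -= 3;
  // Size cue: a block covering most of the page is unlikely to be one unit.
  if (block.width * block.height > 0.5 * pageArea) doc -= 2;
  return Math.max(doc, 1);
}

// Accept a block if its DoC exceeds the PDoC, otherwise divide it further.
function segment(block, pageArea, pdoc = 6, out = []) {
  if (degreeOfCoherence(block, pageArea) > pdoc) {
    out.push(block.id);
  } else {
    for (const child of block.children || []) {
      if (child.tag !== "#text" && child.tag !== "hr") segment(child, pageArea, pdoc, out);
    }
  }
  return out;
}

// Hypothetical page: a header, a separator, and a two-column main area.
const text = { tag: "#text" };
const page = {
  id: "page", tag: "body", background: "white", width: 1000, height: 2000,
  children: [
    { id: "header", tag: "div", background: "blue", width: 1000, height: 200, children: [text] },
    { tag: "hr" },
    { id: "main", tag: "div", background: "white", width: 1000, height: 1500, children: [
      { id: "article", tag: "div", background: "white", width: 700, height: 1500, children: [text] },
      { id: "sidebar", tag: "div", background: "grey", width: 300, height: 1500, children: [text] },
    ] },
  ],
};

const fineBlocks = segment(page, 1000 * 2000);      // default PDoC of 6
const coarseBlocks = segment(page, 1000 * 2000, 2); // lower PDoC, coarser result
```

Raising the PDoC makes the segmentation finer, since fewer blocks are coherent enough to be accepted without further division.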
Function-based Object Model
The Function-based Object Model (FOM) describes the content of a website in terms of functionality instead of semantics. The algorithm divides everything into Basic Objects (BO) and Composite Objects (CO).
A BO has the following properties:
Presentation - how the object is displayed to viewers
Semanteme - the semantic meaning of the object
Decoration - the extent of the 'decorative' role the object plays in the webpage
Hyperlink - what the BO points to
Interaction - can be either 'display' (no interaction), 'button' or 'input'
Rendering Engines
Berkelium
Berkelium is a library that provides off-screen browser rendering using Google's web browser Chromium. It does not interface directly with WebKit, Chromium's rendering engine, but is instead a layer on top of the browser. This allows it to take advantage of the multiprocess rendering offered by Chromium. …
Awesomium
Awesomium is another library that offers off-screen browsing. It differs from Berkelium in that it uses WebKit directly rather than Chromium.
Specification & Design
Specification
From the research carried out, the author has identified the following requirements as essential to creating a usable 3D browser application:
Simple to use
Have intuitive controls
Have restricted movement
Contain a space from where all browsing can take place (no need to open new browser instances)
Have separate controls for each browser window
Event handling for each individual browser
The author believes that these specifications represent the core of any content-centric 3D browser. Such an application may then be extended to provide the following functionality:
Dividing webpages visually into a set of distinct modules
Organizing modules as the user desires into some layout
Distributing layouts to other browser users
Offering such functionality in a standard browser would be infeasible, because the user is restricted to the desktop space in which to manipulate windows. In a 3D environment with seemingly unlimited space, however, this feature can be exploited to great effect.
Objects that need to be handled:
Browser Information
Apart from the URL of the site, information needs to be stored that relates to its position and size in 3D space. Additional information should also be stored relating to the possible splitting of the website.
Layout
A layout comprises a number of browser modules organized in some way. It does not make sense to encode menus as well, since they are static items and can be created on the fly when a layout is inserted into the user space.
Design
The solution's design has gone through a number of iterations, each either improving on the previous one or conforming better to a known design pattern. It is composed of a common 3-tier structure, which shall be covered in detail in the coming sections.
This diagram highlights the data flow through the 3 tiers:
[DFD]
Database
The database is required to store and retrieve the various layouts that users of the browser produce. Instead of storing the actual HTML code, the instructions for dividing the webpage will be stored. This has several advantages, namely saving space and being able to apply a layout over a set of similar webpages, such as an online store or news site.
The following tables show the schema that the database will use:
Layouts
Field Name                Type
layoutId (primary key)    Int
Source                    Varchar
layoutCode                Varchar
Likes                     Long
Modules
Field Name                Type
moduleId (primary key)    Int
layoutId (foreign key)    Int
positionX                 Int
positionY                 Int
sizeX                     Int
sizeY                     Int
vbName                    Varchar
Pdoc                      Int
The database will also incorporate a number of stored procedures which facilitate content creation and retrieval. These shall be further discussed in the implementation section of the dissertation.
Web Service
The Web Service acts as the interface between the browser application and the database. The following methods shall be implemented:
GetModules(url, code) - gets the modules from the database with layout matching url and code.
LikeLayout(url, code, isLike) - if isLike is true, increments the like field of layout matching url and code. If false, it decrements.
AddLayout(Layout, url, code) - If the code for that url is unique, adds the new layout to the database with specified url and code.
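The intended behaviour of the three methods can be illustrated with a small in-memory stand-in. This is a sketch only: the real service sits in front of the database, whereas the `LayoutStore` class, its use of a Map, and the return values shown here are assumptions made for the example. The module fields follow the Modules table above.

```javascript
// In-memory stand-in for the web service (hypothetical; no database involved).
class LayoutStore {
  constructor() {
    this.layouts = new Map(); // key built from (url, code) -> layout record
  }

  static key(url, code) {
    return JSON.stringify([url, code]);
  }

  // AddLayout: stores the layout only if the code is unique for that url.
  addLayout(modules, url, code) {
    const key = LayoutStore.key(url, code);
    if (this.layouts.has(key)) return false;
    this.layouts.set(key, { url, code, likes: 0, modules: [...modules] });
    return true;
  }

  // GetModules: returns the modules of the layout matching url and code.
  getModules(url, code) {
    const layout = this.layouts.get(LayoutStore.key(url, code));
    return layout ? layout.modules : [];
  }

  // LikeLayout: increments the like count if isLike is true, else decrements.
  likeLayout(url, code, isLike) {
    const layout = this.layouts.get(LayoutStore.key(url, code));
    if (!layout) return null;
    layout.likes += isLike ? 1 : -1;
    return layout.likes;
  }
}

// Example usage with illustrative values.
const store = new LayoutStore();
const sampleModules = [
  { positionX: 0, positionY: 0, sizeX: 400, sizeY: 300, vbName: "1-1", pdoc: 6 },
  { positionX: 420, positionY: 0, sizeX: 200, sizeY: 300, vbName: "1-2", pdoc: 6 },
];
store.addLayout(sampleModules, "http://example.com/news", "A1");
```

Keying layouts on the pair (url, code) mirrors the way all three methods identify a layout, and makes the uniqueness check in AddLayout a single lookup.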
Browser Application
The suggested pattern for developing WPF applications is MVVM. However there are some known issues with applying databinding to a Viewport3D. In the end, the developer opted for a more clear-cut MVC design. The application is centered around the application's MainWindow, which is bound to a MainController.
Interface Design
The UI of the application went through several iterations, however the following figure represents the final sketch of the design:
Features
The application is intended to have the following features:
Adding/Removing Browser Instances
Having a Drag-Select function
Class Design
The following shows a sketch of the class diagram:
[Class diagram]
WebModule
The WebModule class contains metadata relating to a browser instance. Specifically, it contains:
Field Name               Type
IsFullWindow             Bool
Position                 Point
Source                   String
Url                      String
Size                     Size
AdditionalInformation    AdditionalInfo
It is important to note that the field 'Source' contains a separated module's HTML code, even though it is generally taken to mean the URL in other applications. The AdditionalInformation class is a wrapper class containing the two Strings, VbName and PDoc. These are both related to the site separator script, which shall be discussed later in the dissertation. 'IsFullWindow' will determine the type of menu to be used when rendering the browser instance. For the purposes of this proof-of-concept, windows that have already been split up cannot be separated again, although it is theoretically possible to do so.
MainController
The MainController class is responsible for handling window events, as well as designating browsers and shapes to be drawn in the viewport. The class keeps a list of all initialized browsers, and can manipulate them individually. The notable fields in the class are:
Field Name         Type
Browsers           ObservableCollection<BrowserController>
Shapes             ObservableCollection<Viewport2DVisual3D>
Window             MainWindow
Conditions         Conditions
CameraPositions    Point3D?[10]
Trackball          Trackball
BrowserController
The BrowserController houses all the elements relating to a particular browser instance. These are:
Field Name         Type
WebModule          WebModule
WebBrowser         WebBrowser
BrowserVisual      Viewport2DVisual3D
MenuVisual         Viewport2DVisual3D
BrowserGeometry    MeshGeometry3D
MenuGeometry       MeshGeometry3D
MenuController
The MenuController is instantiated in the code-behind of the MenuView, and is responsible for passing on events that have been generated as a result of the user clicking the menu. Its fields are:
Field Name           Type
BrowserController    BrowserController
MenuController       MenuController
Since the MenuController merely broadcasts messages to the BrowserController, there was no need to include a reference to the MenuController in the BrowserController class.
Separator Script
Implementation
Browser Application
Introduction
The foundation used in developing the browser application was WPF. It renders graphical elements using DirectX, and also provides an element called 'Viewport3D' which allows the developer to create 3D objects and display them in the application.
A working example was taken as a starting point, which was built upon by the developer. This is the YouCube 3D browser by Chris Cavanagh. The project contains a 6-sided cube, which has a fully interactive webpage rendered on each of its sides.
The length of the MainWindow's code-behind was kept to a minimum, with all appropriate events being handled from the controller. Since the controller cannot add visuals to the Viewport directly, it instead exposes an ObservableCollection<BrowserController>, with an event hooked to it that fires when the collection changes (items are added or removed). The event handler then adds the affected items to the Viewport or removes them from it.
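The pattern can be illustrated outside WPF. The sketch below is a JavaScript analogue, not the application's C# code: `ObservableList` stands in for .NET's ObservableCollection<T> and its CollectionChanged event, and the `viewport` Set stands in for the Viewport3D's children. All names are hypothetical.

```javascript
// Minimal observable list: notifies listeners when items are added or removed.
class ObservableList {
  constructor() {
    this.items = [];
    this.listeners = [];
  }
  onChange(listener) {
    this.listeners.push(listener);
  }
  add(item) {
    this.items.push(item);
    this.emit("add", item);
  }
  remove(item) {
    const index = this.items.indexOf(item);
    if (index === -1) return;
    this.items.splice(index, 1);
    this.emit("remove", item);
  }
  emit(action, item) {
    for (const listener of this.listeners) listener(action, item);
  }
}

// The controller owns the list of browsers and knows nothing about the view.
class Controller {
  constructor() {
    this.browsers = new ObservableList();
  }
}

// The "window" only wires the collection to the viewport.
const viewport = new Set();
const controller = new Controller();
controller.browsers.onChange((action, browser) => {
  if (action === "add") viewport.add(browser.visual);
  else viewport.delete(browser.visual);
});

// Example: two browsers are opened, then the first one is closed.
const first = { visual: "visual-a" };
const second = { visual: "visual-b" };
controller.browsers.add(first);
controller.browsers.add(second);
controller.browsers.remove(first);
```

The view-side code is reduced to a single handler, which is the property that keeps the code-behind short.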
Rendering Engine
Awesomium was picked as the rendering engine of choice. It has a simple interface, and a number of examples in C# made it easier for the developer to start coding. Initially the idea was to use Berkelium; however, the developer was not sufficiently familiar with C++ and was unable to get Berkelium running.
Unfortunately, the Awesomium wrapper used dates from 2009. It was chosen because later versions proved impossible to get running, which meant that certain advances in the Awesomium library were not available to the application.
Separator Script
Currently no stable implementation of WebKit exists for the .NET framework. This made it problematic for the developer to get started on this part of the project. Although .NET has a DOM API, in the developer's opinion using the IE engine to process webpages that are then displayed in WebKit was not very coherent. Instead, he opted to write the module in JavaScript and embed it in the website.
Technically this constitutes a form of cross-site scripting, so to work around this the HTML page is downloaded to disk. In the same folder as the HTML page is the script required to separate the web page into modules. The script is then 'injected' into the webpage by means of string manipulation. The script contains the following snippet of code:
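// Run the separator (VIPSInit) once the page has finished loading.
if (window.addEventListener) {
    window.addEventListener('load', VIPSInit, false);
} else if (document.addEventListener) {
    // Fallback for hosts that expose addEventListener only on document.
    document.addEventListener('load', VIPSInit, false);
}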
This automatically adds an event listener to the webpage, which executes the script when the page loads. An alternative idea was to listen for a 'DOMContentLoaded' event; however, this caused problems with other scripts that also perform post-processing on the site's layout.
Apart from dividing the modules, it was also important to change all the relative links, images and scripts inside the webpage to absolute so that it could be closely approximated on disk. This was relatively simple to implement, thanks to the access that the DOM provides.
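Both preparation steps, injecting the script by string manipulation and rewriting relative references through the DOM, can be sketched as follows. This is an illustration under assumptions rather than the project's code: the function names, the file name `separator.js`, and the choice of selectors are hypothetical. `querySelectorAll`, `getAttribute`, `setAttribute` and the `URL` constructor are standard web APIs.

```javascript
// Insert a <script> tag for the separator before </head>, or before </body>
// if there is no head; append it if neither closing tag is found.
function injectScript(html, scriptFile) {
  const tag = '<script src="' + scriptFile + '"></script>';
  for (const marker of [/<\/head\s*>/i, /<\/body\s*>/i]) {
    const index = html.search(marker);
    if (index !== -1) return html.slice(0, index) + tag + html.slice(index);
  }
  return html + tag;
}

// Rewrite relative link, image and script references to absolute URLs so that
// the copy saved to disk still resolves against the original site.
function absolutizeReferences(doc, baseUrl) {
  const targets = [["a[href]", "href"], ["img[src]", "src"], ["script[src]", "src"]];
  for (const [selector, attribute] of targets) {
    for (const element of doc.querySelectorAll(selector)) {
      try {
        const absolute = new URL(element.getAttribute(attribute), baseUrl).href;
        element.setAttribute(attribute, absolute);
      } catch (e) {
        // Leave references that cannot be parsed as URLs untouched.
      }
    }
  }
}

// Example of the injection step on a minimal page.
const injected = injectScript("<html><head></head><body></body></html>", "separator.js");
```

In a browser, `absolutizeReferences(document, originalUrl)` would be called from the injected script, with `originalUrl` being the address the page was downloaded from.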
Database
Web Service
Evaluation
General
Browsing Application
Memory leak when windows are created and closed.
The validity of a layout is susceptible to changes in the website (as happened with the timesofmalta website)
Some websites work, some don't. This may or may not be due to the website itself containing some sort of problem
Query strings cannot be handled (yet!)
Rendering Engine
The rendering engine proved to be unstable at times. The engine also seems incapable of handling JavaScript prompts and alerts, which can sometimes hinder the browsing experience.
Script Separator
Distribution Framework
Future Work
Better error handling
More flexible design
Full browser functionality (downloads, etc…)
Full screen browser option
More implementations of splitting algorithms
Method to distinguish when a website changes its general format (use a WISARD?)
Conclusions