Intelligent Navigation
Rating documents
We could take into account:
The author's view
The author could
rate the interest of his documents
(optional).
Most looked at
Each time a document
is looked at, it could increment
a "popularity" value, which otherwise decays
as time goes by, e.g.
pop = exp(-alpha*(t - tp)),
with tp = ln(pop_old)/alpha + t_old,
where t_old is the last time
someone looked at the file or followed
the link to it, pop_old is the value stored
at that time, and t is the current time.
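The decay rule above can be sketched as follows. This is only an illustration: the class name, the size of the increment added on each view, and the starting value of zero are assumptions, not part of the proposal.

```python
import math


class Popularity:
    """Exponentially decaying popularity value (illustrative sketch)."""

    def __init__(self, alpha, increment=1.0):
        self.alpha = alpha          # decay rate, assumed > 0
        self.increment = increment  # assumed bump added on each view
        self.pop = 0.0              # value stored at time t_old
        self.t_old = 0.0            # time of the last view

    def value(self, t):
        """Popularity at time t, decayed from the value stored at t_old."""
        if self.pop <= 0.0:
            return 0.0
        tp = math.log(self.pop) / self.alpha + self.t_old
        return math.exp(-self.alpha * (t - tp))

    def view(self, t):
        """Record a view at time t: decay the old value, then increment it."""
        self.pop = self.value(t) + self.increment
        self.t_old = t
        return self.pop
```

With alpha = ln 2 the value halves for every unit of time without a view, so alpha directly sets how quickly old interest is forgotten.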
The reader's view
Any reader might
rate the interest of the document,
like "boring" or "interesting" (optional).
The reader should be allowed at any
time to select the weight of each
rating, with a default value that
he could set in a default file.
The best-rated link, as well as any
link over a default interest value
(e.g. the average value for the whole
text, or a constant), should be colored
in a special way.
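One possible way to combine the ratings and choose the links to color is sketched below. The rating names ("author", "popularity", "reader"), the weighted average, and the function names are assumptions made for the example; the note does not fix how the ratings are combined.

```python
def combined_rating(ratings, weights):
    """Weighted average of the ratings present; optional ones may be missing."""
    total = 0.0
    weight_sum = 0.0
    for name, weight in weights.items():
        value = ratings.get(name)
        if value is None:
            continue  # this rating was not supplied for the document
        total += weight * value
        weight_sum += weight
    return total / weight_sum if weight_sum > 0 else 0.0


def links_to_highlight(link_ratings, weights, threshold=None):
    """Return the best-rated link plus every link above the threshold.

    With threshold=None the average over all links is used, which is one
    of the two defaults suggested in the text (the other is a constant).
    """
    scores = {link: combined_rating(r, weights)
              for link, r in link_ratings.items()}
    if not scores:
        return set()
    if threshold is None:
        threshold = sum(scores.values()) / len(scores)
    best = max(scores, key=scores.get)
    return {best} | {link for link, s in scores.items() if s > threshold}
```

The weights dictionary plays the role of the reader's default file: it could be loaded once and overridden at any time.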
Search in the Web where no index
is provided
The problem dealt with here is: I
am in an HTML file, I know what I
am looking for, I know keywords for
it, and I want to see if there is
anything about it available FROM
the current file through its links.
Where to search ?
A breadth-first traversal
seems to be the only way, if we don't
want our grand-children to get the
answer for us...
The search could detect whether a file
has already been examined,
and reuse the results saved for it.
For best results, after a study of
each link of a given file, the search
should study the links of the best-rated
file, and then the links of the next
best-rated file so far, wherever
it is.
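The traversal order described above (always continue from the best-rated file found so far) can be sketched with a priority queue. The callables `get_links` and `rate` and the `max_files` bound are hypothetical stand-ins for fetching a file's links and rating it.

```python
import heapq


def best_first_search(start, get_links, rate, max_files=100):
    """Expand the links of the best-rated file seen so far, wherever it is.

    Returns the ratings of every file examined and the order in which
    files had their links studied.
    """
    ratings = {start: rate(start)}          # saved results: rate each file once
    frontier = [(-ratings[start], start)]   # max-heap via negated ratings
    expanded = []
    while frontier:
        _, url = heapq.heappop(frontier)
        expanded.append(url)
        for link in get_links(url):
            if link in ratings:
                continue                     # already looked through
            if len(ratings) >= max_files:
                return ratings, expanded     # assumed safety bound
            ratings[link] = rate(link)
            heapq.heappush(frontier, (-ratings[link], link))
    return ratings, expanded
```

All links of a file are rated before the search moves on, and the dictionary of ratings doubles as the record of files already looked through.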
How to rate the interest of a file ?
We could take into account:
- The text
  How many times the keywords
  are used in the text.
- The titles
  Each time one of the keywords
  is used in a title, this should greatly
  increase the interest of the
  file.
- Its own links
  The file should be
  given some feedback from the interest
  ratings of the files it links to, which
  might themselves have been corrected
  if their own linked files show sufficient
  interest, and so on.
- General interest
  General information
  such as that mentioned above would also be
  used: the author's rating, the readers'
  average rating, the number of readers
  having looked at it per unit
  of time ...
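A minimal sketch of such a rating, using Python's standard `html.parser` module. The weights, the choice of title tags, and the rule for link feedback (a fraction of the best linked rating) are assumptions made for illustration only.

```python
import re
from html.parser import HTMLParser

TITLE_TAGS = {"title", "h1", "h2", "h3", "h4", "h5", "h6"}


class _TextSplitter(HTMLParser):
    """Collects words found inside titles separately from body words."""

    def __init__(self):
        super().__init__()
        self.depth = 0          # > 0 while inside a title or heading tag
        self.title_words = []
        self.body_words = []

    def handle_starttag(self, tag, attrs):
        if tag in TITLE_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in TITLE_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        words = re.findall(r"[a-z0-9]+", data.lower())
        (self.title_words if self.depth else self.body_words).extend(words)


def rate_file(html, keywords, title_weight=10.0,
              link_ratings=(), link_weight=0.5, general=0.0):
    """Rate a file from its text, its titles, its links and general interest."""
    parser = _TextSplitter()
    parser.feed(html)
    parser.close()
    keys = {k.lower() for k in keywords}
    text_hits = sum(1 for w in parser.body_words if w in keys)
    title_hits = sum(1 for w in parser.title_words if w in keys)
    own = text_hits + title_weight * title_hits
    # Assumed feedback rule: a fraction of the best rating among linked files.
    feedback = link_weight * max(link_ratings, default=0.0)
    return own + feedback + general
```

Because the feedback term uses ratings that may themselves include feedback, calling `rate_file` again after the linked files have been rated gives the recursive correction described in the list.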
How long ?
This is of course the most important question.
Given unlimited time, a search
can be quite accurate, but it will be
very inefficient for the reader.
There are many ways to stop the search:
- File found
  A file that seems sufficiently
  related to the keywords is found,
  and the reader wants to get right
  down to it. The reader would have
  to define the limit between "sufficiently
  related" and "not sufficiently related"
  somehow, or use a default value.
- Depth reached
  The reader could set
  a depth as a limit (e.g. the search
  should not follow more than three
  consecutive links).
- Time over
  The reader could set a
  maximum time for the search, whatever
  other limitations he may use. He
  should also be allowed to stop the
  search at any time and get the best
  result so far.
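The stopping rules listed above could be gathered in one small object that a search loop consults after each file. The class, its field names, and the default values are hypothetical.

```python
import time
from dataclasses import dataclass, field


@dataclass
class StopRules:
    """Reader-defined limits for a search (illustrative defaults)."""

    good_enough: float = float("inf")  # rating at which a file counts as found
    max_depth: int = 3                 # consecutive links the search may follow
    max_seconds: float = 10.0          # maximum search time
    started: float = field(default_factory=time.monotonic)
    cancelled: bool = False            # set to True when the reader interrupts

    def may_follow(self, depth):
        """True if a file reached after `depth` links is within the limit."""
        return depth <= self.max_depth

    def should_stop(self, best_rating, now=None):
        """Return the reason for stopping, or None to keep searching."""
        now = time.monotonic() if now is None else now
        if self.cancelled:
            return "stopped by reader"
        if best_rating >= self.good_enough:
            return "file found"
        if now - self.started >= self.max_seconds:
            return "time over"
        return None
```

Whatever the reason returned, the search would hand back the best result so far rather than nothing.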
Then what ?
Once the search is over, the reader
could have two choices:
- To get the best-rated file, and then
the others in order of decreasing interest.
The best-rated file isn't necessarily
directly linked to the reader's
current file.
- To get the best-rated path, which
means that the best link he should
use would be highlighted in some
way at each stage.
The reader should have the possibility
to keep searching, while starting
to read the documents found.
We now see better the difference
between his two choices:
When he only wants the best-rated
files, the reader will have access
to files that may not be closely related
to one another. When he takes
the best-rated path, the reader will
follow links that have been created
by a human being in an order that
we may suppose to be logical.
When the reader wants some detail
on a well-known field, he could take
the first search method; when he
needs a more logically ordered presentation
of an unknown field, he could take
the second search method.
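The two ways of presenting the results can be sketched as below. The sketch assumes the search has produced a dictionary of ratings and, for each file, the file through which it was first reached (`parents`). Reading "best-rated path" as the chain of links leading to the best-rated file is an interpretation, not something the note specifies.

```python
def ranked_files(ratings):
    """First choice: all files found, by decreasing interest."""
    return sorted(ratings, key=ratings.get, reverse=True)


def best_path(ratings, parents, start):
    """Second choice: the links to follow, stage by stage, from the
    reader's current file to the best-rated file."""
    target = max(ratings, key=ratings.get)
    path = [target]
    while path[-1] != start:
        path.append(parents[path[-1]])  # step back towards the start
    path.reverse()
    return path
```

A client would highlight, in each file on the path, the link leading to the next one.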
Increased Speed
Depending on its own capabilities,
the client could dedicate part of
its memory to guessing which file(s)
the reader might ask for next,
and fetch them into memory while the reader
is reading the current document. This
should depend upon how much
memory is available for it, how large
the documents are, how difficult
the guess is, how congested the
network is...
The smartest approach might be to ask for
a transfer of the first page only,
so that the rest of the file could
be transferred while the reader
is reading the beginning of it.
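A minimal sketch of such a client-side store, under stated assumptions: `fetch(url, limit)` is a hypothetical callable returning at most the first `limit` bytes of a document, the guesses arrive ordered from most to least likely, and a "first page" is approximated by a fixed byte count.

```python
class PrefetchCache:
    """Holds the first page of documents the reader may ask for next."""

    def __init__(self, fetch, budget_bytes, first_page_bytes=4096):
        self.fetch = fetch                  # hypothetical: fetch(url, limit) -> bytes
        self.budget = budget_bytes          # memory dedicated to guesses
        self.first_page = first_page_bytes  # assumed size of a "first page"
        self.store = {}                     # url -> first page
        self.used = 0

    def prefetch(self, guesses):
        """Fetch first pages, best guess first, until the budget is used up."""
        for url in guesses:
            if url in self.store:
                continue
            data = self.fetch(url, self.first_page)[: self.first_page]
            if self.used + len(data) > self.budget:
                break                       # keep the better guesses already held
            self.store[url] = data
            self.used += len(data)

    def get(self, url):
        """Return the stored first page, or None if the guess missed."""
        return self.store.get(url)
```

On a hit the client can display the stored page at once and request the remainder of the file in the background.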
AS