Integration of Database and Internet Technologies
The content of many Web sites changes frequently. Web site
performance, including system uptime and user response time,
is a key differentiator among companies that are eager
to reach, attract, and keep customers. We therefore develop
solutions for integrating Internet services, business logic,
and database technologies, and for improving the end-to-end
scalability of e-commerce systems. (This work was done during
my visit to NEC USA, CCRL; as a result, I summarize only the
portion of the work that has already been published.)
Web performance is a key differentiator among content providers.
Snafus and slowdowns at major Web sites demonstrate the difficulty
that companies face in scaling to large volumes of Web
traffic. One solution to this problem is to store Web content
in server-side and edge caches for fast delivery to end
users. However, for many e-commerce sites, Web pages are created
dynamically based on the current state of business processes,
as represented in application servers and databases. Since application
servers, databases, Web servers, and caches are independent
components, there is no efficient mechanism for propagating changes
in the database content to the cached Web pages.
As a result, most application servers have to mark dynamically
generated Web pages as non-cacheable. Thus, each user request
for such a page must be served by regenerating the page, which
adds delay.
We develop an architectural framework for enabling dynamic content
caching for database-driven e-commerce sites. More specifically,
we develop techniques for intelligently invalidating dynamically
generated Web pages in the caches, thereby enabling caching
of Web pages generated from database contents. The solution
uses a two-layer approach that divides the problem into two
parts: sniffing, that is, mapping the relationship
between Web pages and the underlying queries; and, once
the database is updated, invalidating the Web content that depends
on queries affected by the update. Consequently,
the proposed architecture consists of two independent components:
a sniffer, which collects information about user requests,
and an invalidator, which removes cached pages that are affected
by updates to the underlying data.
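
The division of labor between the two components can be illustrated with a minimal in-memory sketch. It is not the published implementation. The class names `Sniffer` and `Invalidator`, the methods `record` and `on_update`, and the table-level dependency test are all assumptions made for illustration; the text above speaks of queries affected by an update, and a real invalidator would test that condition more precisely than "the page read from the updated table".

```python
from collections import defaultdict


class Sniffer:
    """Records which queries were issued while each page was generated."""

    def __init__(self):
        # page URL -> set of (query text, frozenset of tables the query reads)
        self.page_queries = defaultdict(set)

    def record(self, url, query, tables):
        self.page_queries[url].add((query, frozenset(tables)))


class Invalidator:
    """Removes cached pages whose recorded queries read an updated table."""

    def __init__(self, sniffer, cache):
        self.sniffer = sniffer
        self.cache = cache  # plain dict standing in for a Web cache: URL -> page

    def on_update(self, updated_table):
        # Assumption: a page is stale if any of its queries reads the table.
        stale = [
            url
            for url, queries in self.sniffer.page_queries.items()
            if url in self.cache
            and any(updated_table in tables for _, tables in queries)
        ]
        for url in stale:
            del self.cache[url]
        return sorted(stale)


def demo():
    cache = {"/cars": "<html>cars</html>", "/books": "<html>books</html>"}
    sniffer = Sniffer()
    sniffer.record("/cars", "SELECT * FROM car WHERE price < 20000", {"car"})
    sniffer.record("/books", "SELECT title FROM book", {"book"})
    invalidator = Invalidator(sniffer, cache)
    removed = invalidator.on_update("car")
    return removed, sorted(cache)
```

In this sketch, an update to the hypothetical `car` table evicts only `/cars`, and `/books` stays cached. Keeping the mapping in the sniffer and the eviction logic in the invalidator mirrors the independence of the two components described above.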
Another related issue is the creation and maintenance of usable
e-commerce sites. One important criterion of site usability is ease of
navigation. To improve the ease of navigation, we first have
to identify the pages that are important, based on users' interests
and the site structure. Such pages, or nodes, need to satisfy the
following criteria:
- High connectivity, so that users can navigate easily from these
  site map nodes to other nodes.
- The contents of these site map nodes need to be representative.
Thus, the selected nodes need to be representative both structurally
and in content. We therefore develop algorithms that
identify such representative nodes and use them to guide the
user in site navigation.
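
One way to combine the two criteria can be sketched as follows. This is an illustrative scoring under stated assumptions, not the algorithm developed in this work. The function names `cosine` and `rank_nodes` are hypothetical. Link degree stands in for connectivity, and cosine similarity between a page's term counts and the site-wide term counts stands in for content representativeness.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_nodes(links, texts):
    """Rank pages by (normalized link degree) * (similarity to site content)."""
    # Structural score: in-degree plus out-degree of each page.
    degree = Counter()
    for src, dsts in links.items():
        for dst in dsts:
            degree[src] += 1
            degree[dst] += 1
    max_degree = max(degree.values(), default=1)

    # Content score: compare each page with the sum of all page vectors.
    vectors = {page: Counter(text.lower().split()) for page, text in texts.items()}
    centroid = Counter()
    for vec in vectors.values():
        centroid.update(vec)

    scores = {
        page: (degree[page] / max_degree) * cosine(vec, centroid)
        for page, vec in vectors.items()
    }
    # Highest score first; ties are broken by page name for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))


links = {"home": ["books", "music", "help"], "books": ["home"], "music": ["home"]}
texts = {
    "home": "books music store",
    "books": "books store",
    "music": "music store",
    "help": "contact",
}
ranking = rank_nodes(links, texts)
```

Multiplying the two scores means a page must do well on both criteria to rank highly: a well-linked page with unrepresentative content, or a representative page that is hard to reach, is pushed down the list.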