Mapping technology chaos

Power engineers are under pressure to develop and deploy new technologies at an ever-quickening pace. The world of power engineering today might best be described as one of technology chaos. Strict new requirements for SO2, NOx, and mercury reduction are here. CO2 capture and sequestration are on the horizon and on the Hill. Then there are efficiency issues, renewable portfolio standards, integrated gasification combined-cycle (IGCC) technology, next-generation nukes, the hydrogen economy, biomass, solar and wind, and so much more. Untold billions of dollars are in play, with the cost of SO2 control alone pegged at $50 billion over the next five years.

In the midst of all this uproar, engineers are supposed to know what is going on and what is coming, but the projects and possibilities are legion. Clearly, engineers have a lot of homework to do for their own projects. They also have to avoid being blindsided in a meeting by the latest whiz-bang Wall Street Journal article. Technology assessment in the face of this kind of chaos may seem impossible, but it is also mandatory.

Why is it all so complicated? The Department of Energy’s Office of Scientific and Technical Information (OSTI) is studying this question. OSTI is taking big steps to make it feasible to get one’s arms around the latest science and technology. The OSTI focus is on finding DOE research first, federal research second, and then science and technology around the world, as explained below.
 

Diffusion confusion

How does a power engineer find out what’s happening that’s related to his or her field or project? Why is it so hard?

The basic technology assessment problem can be put in one word—diffusion. The flow of knowledge in science and technology is an extremely complex diffusion process. Much of the complexity is due to two simple processes that overlay one another—convergence and divergence.

Figure 1 is a highly simplified and stylized picture of science and technology diffusion. The sheer complexity of the interrelationships is what makes the concept so hard to grasp, even though the individual relationships may be clear. Not only is the pattern complex, but it represents many possible combinations of divergent and convergent flow over time in the future. The possibilities are not endless, but they are many. They may even be well defined, but the array is structurally complex and virtually impossible to visualize mentally.

1. Science and technology diffusion. The flow of knowledge is like a complex diffusion process. Each lettered box represents a project. Links indicate the potential flow of results from one project to another at a later stage of development, or from left to right. Source: DOE OSTI
 

Each lettered box in Figure 1 represents a project currently being developed. Projects span the spectrum of development, from basic, cutting-edge research to fielding established technologies. The projects included depend on the scale of interest. A project’s focus might be narrow, like corrosion on IGCC turbine blades, or very broad, like clean coal technology. On a broader scale, the projects might represent entire research communities.

Links indicate the potential flow of results from one project to another at a later stage of development, or from left to right. For simplicity’s sake, each project is shown feeding just three downstream projects. Not shown is the fact that new projects may come into being, and existing ones may disappear, as time goes by.

There are a great many link-by-link paths between distant projects. That’s the consequence of divergence and convergence. Results from A may diverge, working their way to C or D, or both. Yet C and D may be very different technologically. Likewise, results from A or B, or both, may converge on C. People working on operational-level power technologies, who are interested in what’s new, tend to look for convergence. But for those seeking to understand how a new basic technology will change the status quo, divergence is mostly what they look at. Trying to look at both at the same time is already very hard, and it’s getting harder.
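The ideas of divergence, convergence, and link-by-link paths can be made concrete with a small sketch. The graph below is a hypothetical seven-project example, not the actual Figure 1, and the function names are illustrative assumptions. It treats projects as nodes and potential flows of results as directed links.

```python
from functools import lru_cache

# Hypothetical toy diffusion graph (not the real Figure 1): each key is a
# project, each value lists the later-stage projects that could use its results.
LINKS = {
    "A": ["C", "D"],
    "B": ["C"],
    "C": ["E", "F"],
    "D": ["F"],
    "E": ["G"],
    "F": ["G"],
    "G": [],
}

def downstream(project, links=LINKS):
    """Divergence: every project that results from `project` could reach."""
    seen, stack = set(), [project]
    while stack:
        for nxt in links[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def upstream(project, links=LINKS):
    """Convergence: every project whose results could reach `project`."""
    return {p for p in links if project in downstream(p, links)}

def count_paths(src, dst, links=LINKS):
    """Count distinct link-by-link paths from src to dst (acyclic graphs only)."""
    @lru_cache(maxsize=None)
    def walk(node):
        if node == dst:
            return 1
        return sum(walk(n) for n in links[node])
    return walk(src)

print(sorted(downstream("A")))   # what A's results might feed
print(sorted(upstream("C")))     # what might feed C
print(count_paths("A", "G"))     # number of routes from A to G
```

Even in this tiny graph there are several routes from A to G. With hundreds of projects and three or more outgoing links apiece, the number of paths grows quickly, which is why the full picture is so hard to hold in one's head.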
 

Knowledge from data

In real-world situations, the projects involved may number in the hundreds or thousands. For example, OSTI’s research report collection, Information Bridge, lists more than 10,000 project reports related to corrosion. Another 10,000+ are about combustion. Comprehending everything going on in combustion-related corrosion research is clearly a colossal task.

In fact, comprehending everything in any technology discipline is a daunting challenge. A few Google searches no longer suffice. But hope is not lost. Understanding the diffusion of the science and technology related to a specific power engineering issue is like any other engineering problem. You have to scale it to your resources and use the proper tools. In other words, you have to bound the problem or it cannot be solved.

A science and technology assessment problem might include any of the following:

  • The convergence of diffusion pathways to a given technology, problem, or application.
  • The divergence of a particular breakthrough.
  • The neighborhood or cluster of activities related to a specific technology at a specific stage of development.
  • A single diffusion pathway from a specific project to a specific application.

Other combinations of projects and pathways also are possible. In every case, it is critical to limit the search to those projects and links that can be feasibly assessed. The feasibility of understanding is the key concept here. One cannot examine everything.

It is particularly important to distinguish basic research, pilot tests, and the like from real-world applications. This involves what OSTI staffers call the "diffusion distance"—the number of projects and links between projects at different stages of development. Speculation about new science and technology in the general press often misunderstands and understates the great diffusion distance from basic research to actual application. Every stage of development normally takes several years to work through. In particular, basic research breakthroughs often take 10 to 30 years to become useful. Even proven pilot technologies may be a long way from actual application. A feasible assessment may have to simply ignore distant diffusion.
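One way to picture diffusion distance is as the fewest links separating two projects in a diffusion graph. The sketch below is a minimal illustration under that assumption. The stage names and the `max_links` cutoff are hypothetical, and OSTI's own measure may well be defined differently.

```python
from collections import deque

# Hypothetical chain of development stages, with one side branch.
STAGES = {
    "basic_research": ["applied_research"],
    "applied_research": ["pilot_test"],
    "pilot_test": ["demonstration", "spin_off"],
    "demonstration": ["commercial_use"],
    "spin_off": [],
    "commercial_use": [],
}

def diffusion_distance(links, src, dst):
    """Fewest links from src to dst, or None if dst cannot be reached."""
    queue = deque([(src, 0)])
    seen = {src}
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in links.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def within_distance(links, src, max_links):
    """Bounded search: projects at most max_links away, with their distances."""
    reached = {}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_links:
            continue  # do not expand beyond the chosen bound
        for nxt in links.get(node, []):
            if nxt not in reached and nxt != src:
                reached[nxt] = dist + 1
                queue.append((nxt, dist + 1))
    return reached

print(diffusion_distance(STAGES, "basic_research", "commercial_use"))
print(within_distance(STAGES, "basic_research", 2))
```

Capping the search at a fixed number of links is one simple way to ignore distant diffusion while still covering the near neighborhood of a project.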

On the other hand, if one has a very specific technical problem, the solution may already exist in a distant research community. Other things being equal, the narrower the technical problem, the more distant the search can be. Once again it is a matter of knowing what can and cannot be done. OSTI is working to increase the efficiency of assessment.
 

The maze of federal R&D

The possible ranges of breadth of assessment or distance of search are exemplified by the U.S. federal government. Much publicly available research is federally funded, so it’s useful to find out about it. The Department of Energy has the specific mandate to fund power-related research, so that is the obvious place to start.

The DOE spends $8 billion a year on R&D, in a confusing array of ways. About $4 billion is for basic research, under the Office of Science (SC). SC funds research related to a vast array of problems, from combustion to cosmology. In fact it funds almost half of federally supported basic research in the physical sciences. The other $4 billion, in applied R&D, is likewise scattered across many mission-specific programs. There are programs for fossil energy, nuclear power, energy efficiency and renewables, and electricity reliability. The DOE also has a large nuclear weapons program, several environmental research programs, and even a major genomics program.

Where the research is done is another confusing array. Much of it is spread among several dozen federal labs and user facilities, sometimes with little apparent reason. Some labs are fairly specialized. For example, the National Energy Technology Lab works mostly, but not entirely, on fossil fuel–related research. The National Renewable Energy Lab, as its name implies, focuses on renewables, but not exclusively. Los Alamos National Lab works on nuclear energy projects, often in conjunction with other national labs.

Other facilities—such as Oak Ridge National Lab and the Pacific Northwest National Lab—are multipurpose and may work under any DOE program. Yet other labs are single-purpose, centered on a huge atom smasher or similar instrument. But even here the research may cover a broad array of materials and engineering problems.

The overall federal government R&D program is much larger, and it too is not organized by science, and only roughly by technology. Many departments and agencies fund basic and applied research that is potentially useful to power engineering problems. To bound a science or technology assessment, you may have to restrict the search to a single federal facility, program, office, or department. But before you begin, you have to know how and where to look.
 

OSTI as a technology portal

OSTI and the other federal scientific and technical information agencies have been working hard to find ways to deal with this confusing array of funding programs and research efforts. OSTI began as a filing cabinet in the vast Manhattan Project in the early 1940s. Over time, it grew into the repository for research reports stemming from all DOE-funded research. Obviously, given the DOE’s $8 billion annual R&D budget, a great many reports have come in. Much more has been spent on managing projects than on the paper they produce. People could get copies of reports—generally, by calling the researchers who wrote them. OSTI also put together some compilations and educational materials, but as a minor job. In 1990, OSTI was still basically a warehouse in Tennessee.

The World Wide Web has changed everything. OSTI now offers its vast collection of research results free to anyone with a browser. Moreover, it no longer just processes reports on DOE projects; it finds and makes available reports from around the world that may be of use to DOE-involved researchers and engineers—including power engineers. OSTI now facilitates a global flow of technical information. It has evolved from a filing cabinet to a global communications center.

In addition to DOE reports, OSTI provides access to summaries of projects funded by the DOE and other federal agencies, conference proceedings from many scientific associations, preprints of scholarly articles, university publication sites, and a variety of other collections. Millions of pages are available. All power engineers need to know about OSTI’s work. Many already do, as downloads from its web site exceed 1.5 million a year.

OSTI has been part of a much broader effort as well: broadening search capability to include all U.S. federal research agencies and, ultimately, agencies around the world. The prototype today is www.science.gov. This site supports searches of all major U.S. federal research agency repositories (the other OSTI-like entities), totaling about 50 million pages of technical content.

Work is under way to expand this concept in two ways. One initiative is the Science.world gateway, where other countries’ collections will be jointly federated along the lines of science.gov. The second is the DOE Science Accelerator, which proposes to make all the available large collections (similar to the conference and preprint collections that OSTI currently offers) available using new search technologies, such as two-way language translation. More on both later.
 

OSTI search tools and collections

Bounding one’s search means knowing which tools are available and what they do and do not cover. OSTI’s specialized tools were developed because Google and the other general-purpose search engines do not cover most of the research document repositories and databases. These myriad document repositories and databases are referred to as the Deep Web. By some estimates, over 90% of all web-accessible technical content is hidden there.

Web crawlers do not reach the Deep Web because document databases in it are only accessible by specific local searches. OSTI has done pioneering work in changing this situation. Many of OSTI’s tools work by first translating a user’s query into a separate local search for each database, and then combining and jointly ranking all the results. This is very different from what a crawler does, and it requires a lot of custom tailoring for each database.
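The translate, search, and combine pattern described above can be sketched in a few lines. Everything here is a hypothetical stand-in: the two in-memory "databases," their query formats, and the simple term-overlap score are assumptions for illustration only and do not represent OSTI's actual software or ranking method.

```python
# Two hypothetical databases with different schemas and query conventions.
REPORTS_DB = [
    {"title": "Corrosion of turbine blades in syngas combustion"},
    {"title": "Mercury capture in coal plants"},
]
PREPRINTS_DB = [
    ("Combustion corrosion modeling", 2005),
    ("Wind turbine siting", 2006),
]

def search_reports(keywords, limit):
    """Local search 1: expects a list of lowercase keywords."""
    hits = [r["title"] for r in REPORTS_DB
            if any(k in r["title"].lower() for k in keywords)]
    return hits[:limit]

def search_preprints(query_string, limit):
    """Local search 2: expects a single 'a OR b' query string."""
    keywords = [k.strip().lower() for k in query_string.split(" OR ")]
    hits = [title for title, _year in PREPRINTS_DB
            if any(k in title.lower() for k in keywords)]
    return hits[:limit]

# One adapter per database translates the common query into the local form.
ADAPTERS = {
    "reports": lambda terms, limit: search_reports(terms, limit),
    "preprints": lambda terms, limit: search_preprints(" OR ".join(terms), limit),
}

def relevance(title, terms):
    """Common score applied to every hit: fraction of query terms in the title."""
    text = title.lower()
    return sum(t in text for t in terms) / len(terms)

def federated_search(query, per_source_limit=200):
    """Query every database separately, then rank all hits on one scale."""
    terms = query.lower().split()
    merged = []
    for source, adapter in ADAPTERS.items():
        for title in adapter(terms, per_source_limit):
            merged.append((relevance(title, terms), source, title))
    merged.sort(key=lambda hit: (-hit[0], hit[2]))  # best score first, then title
    return merged

for score, source, title in federated_search("combustion corrosion"):
    print(f"{score:.2f}  {source:9}  {title}")
```

The custom tailoring the text mentions lives in the adapters: each new database needs its own translation step, while the merging and ranking code stays the same.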

By way of analogy, a similar situation exists in industrial and consumer shopping. Because many product sites are database-driven, one has to go to the site and execute a search to gain entry and shop. In some cases, multiple databases have been federated and even include product ranking by price. OSTI’s combined relevance ranking does the same job for science and technology content.

Following are descriptions of a few of OSTI’s many search tools. Each is highly specialized and requires the user to progress down a learning curve. None of the tools is simple, and all are relatively crude. Google spends over $4 billion a year, including $500 million on R&D. OSTI’s budget is just $8 million, so its tools have no bells and whistles. But they already provide access to more than 10,000,000 pages of research results and technical material, and more are reachable every day.

Information Bridge. This is OSTI’s foundation collection, the filing cabinet of all DOE research reports produced during the last decade. Tens of billions of dollars’ worth of research are documented here, much of it power-related. Because it is an internal DOE collection, there is also extensive bibliographic information for each entry. This makes it possible to do complex advanced searches using different metadata fields in the document database.

One of the most powerful and useful features in the fielded or advanced search function is the "select subject" button. It brings up a very large semantic structure or word-word link system that can help users find the best technical search terms. The system combines a taxonomy of energy-related words and something akin to a thesaurus. This thesaurus does not provide synonyms but, rather, clusters of terms that are closely related from an engineering point of view. The system includes 30,000 words, about 200,000 word-word relations, and 45,000 taxonomic pathways from broader to narrower concepts. In addition to helping a user pick words as search terms, the system is useful for understanding the conceptual structure of energy science and engineering.
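A structure of this kind can be pictured as two linked tables: one holding broader-to-narrower taxonomy links and one holding related-term clusters. The sketch below uses a handful of made-up entries to show how such a structure could suggest search terms and list taxonomic pathways. The terms, links, and function names are illustrative assumptions, not the contents or interface of the actual OSTI system.

```python
# Hypothetical broader -> narrower taxonomy links.
NARROWER = {
    "energy": ["fossil fuels", "combustion"],
    "fossil fuels": ["coal"],
    "combustion": ["coal combustion"],
    "coal": ["coal combustion"],
}

# Hypothetical word-word relations: related from an engineering point of
# view, not synonyms.
RELATED = {
    "coal": ["mercury"],
    "coal combustion": ["fly ash", "corrosion"],
}

def pathways(term, root="energy"):
    """All taxonomic pathways from the root concept down to `term`."""
    if root == term:
        return [[term]]
    found = []
    for child in NARROWER.get(root, []):
        for path in pathways(term, child):
            found.append([root] + path)
    return found

def suggest_terms(term):
    """Candidate search terms: narrower concepts plus related terms."""
    return sorted(set(NARROWER.get(term, [])) | set(RELATED.get(term, [])))

for path in pathways("coal combustion"):
    print(" > ".join(path))
print(suggest_terms("coal combustion"))
```

A concept can sit under more than one broader heading, which is why the number of pathways in such a system can exceed the number of words.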

ePrint Network. This is a federated collection of about one million technical articles and related materials found in databases and on the Web. It includes what are called preprints—articles that have not yet appeared in scholarly journals. It also includes the publication web pages of more than 20,000 university faculty as well as many engineering and science departments. This makes it easy to go from a single paper to the whole body of a researcher’s related work.

Science (and engineering) Conference Proceedings. Conference proceedings often precede publication of research results by a year or more. This collection federates 26 large databases. It contains hundreds of thousands of papers and presentations, many from professional societies.

Federal R&D Project Summaries. For government-wide search, this is a federated gateway to individual project summaries from six of the largest research funding agencies. In many cases, the search results include recent awards, which may precede research reports or publications by several years.

Science.gov. Also for government-wide search, www.science.gov is a collaborative portal supported by almost all of the federal R&D agencies, both science and engineering. It federates 30 massive databases, including Information Bridge and its counterparts. Only the top 200 or so documents are returned from each database, so it is most useful for finding out which agency is funding the research of interest. The user can then bore down into the relevant agency database.

Tools of the future

Whereas www.science.gov federates the major U.S. government science and engineering databases, the idea behind the Science.world gateway is to combine the equivalent resources of many different countries. DOE’s undersecretary for science recently signed an agreement with the British Library to launch the initiative.

The idea behind the DOE Science (and engineering) Accelerator is a simple one. OSTI estimates that there are 1,000 or so major document databases in the world. The different collections that OSTI has federated to date, of a few dozen databases each, are small samples of this great universe. Preliminary research indicates that it should be feasible to federate all of the existing databases into one searchable collection. To be sure, there will be serious scaling issues, but federation seems possible.

One of the greatest challenges in the Science Accelerator is dealing with language translation. Although many journals are in English, most research reports, conference proceedings, project summaries, and the like are in the researcher’s native language. It is envisioned that the Accelerator will take a user’s query and translate it into all the languages of interest, do the search, and then translate the results from all the languages found back into the user’s language.

A little-recognized aspect of knowledge diffusion is that educational web content is not just for students. An engineer or scientist investigating a new technical area for a potential solution to a problem often has to enter through an educational door. The Accelerator will be designed to find and identify the less-technical, introductory, or educational content related to a technical subject.

Visualization is another potentially powerful tool for technology assessment. It goes beyond relevance-ranked lists to use maps that make visible the complex structures that underlie science and technology diffusion. OSTI is already doing research on diffusion visualization.

Spreading the word

In addition to developing new tools and expanding their scope, OSTI is researching how the broad diffusion of science and technology works. OSTI staff hope to adapt mathematical contagion modeling, used to predict the spread of disease, to the spread of ideas.

Results to date are very encouraging. The dots in Figure 2 represent the cumulative number of authors who have published research results in the field of carbon nanotubes. The line through the dots is a best-fit curve produced by a mathematical contagion model. The values of the various model parameters are determined by the best fit.

One of the key model parameters is the contact rate. In the case of disease, this is basically the number of people with whom an average infected person comes in contact, spreading the disease. In the case of science and technology, it is the number of new people who adopt the new knowledge and extend the research.

In Figure 2, the line to the left of the dots shows what happens in the model if we double the contact rate: The growth of the research accelerates, by as much as five years. Other cases, from other fields, yield similar results. This diffusion research suggests quite strongly that making it easier for scientists and engineers with specific problems to find and contact one another should accelerate the diffusion of new knowledge. This is precisely what OSTI is trying to do: facilitate contact by helping scientists and engineers find one another.
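The effect of doubling the contact rate can be illustrated with the simplest contagion curve, the logistic (susceptible-infected) model. This is a sketch under stated assumptions: the population size, starting count, and contact rate are hypothetical round numbers, no fitting to real author data is done, and the model OSTI actually used may have more parameters.

```python
import math

def cumulative_authors(t, contact_rate, population, initial=1.0):
    """Logistic (SI) curve: authors who have taken up the idea by year t."""
    ratio = (population - initial) / initial
    return population / (1.0 + ratio * math.exp(-contact_rate * t))

def years_to_reach(level, contact_rate, population, initial=1.0):
    """Invert the logistic curve: years until `level` authors have published."""
    ratio = (population - initial) / initial
    return math.log(ratio * level / (population - level)) / contact_rate

# Hypothetical values: 1,000 eventual authors, contact rate 0.5 per year.
POPULATION = 1000.0
BASE_RATE = 0.5

base = years_to_reach(500, BASE_RATE, POPULATION)
doubled = years_to_reach(500, 2 * BASE_RATE, POPULATION)
print(f"years to 500 authors at base rate:    {base:.1f}")
print(f"years to 500 authors at doubled rate: {doubled:.1f}")
print(f"years saved: {base - doubled:.1f}")
```

In this simple model, doubling the contact rate exactly halves the time needed to reach any given number of authors, so the later a milestone falls on the original curve, the more years are saved.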

2. Knowledge is contagious. Knowledge spreads much like a communicable disease. Models like this one make it possible to predict the rate of its spread. The line to the left of the dots shows what happens in the model if we double the contact rate: The growth of the research accelerates, by as much as five years. Source: DOE OSTI
 

Work smarter

OSTI’s goal is an ambitious one: accelerating the spread of ideas and solutions in science and technology, globally and across at least the physical sciences. Because power technology and its supporting science are central to the DOE’s mission, power engineers stand to benefit from this effort.

For power engineers, the trick will be learning to use OSTI’s new tools to map and master the technology chaos facing the power industry. That won’t be easy, but it can be done. Engineers still will have to bound their technology assessments, but better tools mean more will be accomplished by the same effort.

Dr. David Wojick, PE, is a consultant on the diffusion of science and technology. He can be reached at dwojick@hughes.net or 540-858-3503.