eRulemaking @ Carnegie Mellon
Notice and comment rulemaking is the process of creating new governmental regulations by proposing new rules, collecting and considering public comments on the rules, and making the final rules. Proposed regulations must be published in draft form, the public must be allowed to comment, and the agency must consider the public's comments before issuing the final regulations. This process is specified under section 553 of the Administrative Procedure Act (APA), which was enacted in 1946.
In the past, public comments were submitted to the U.S. federal government primarily in paper form. During the last several years, however, the government has begun to allow comments to be submitted electronically in some cases. The Regulations.gov website was recently created to make it easier for citizens to examine and comment on proposed regulations, so the volume of electronic comments is expected to grow rapidly.
The process of soliciting and considering electronically submitted public comments is called "eRulemaking". eRulemaking offers opportunities for the government to reduce its costs and improve the quality of notice and comment rulemaking, but it also poses a variety of new social, political, and technical challenges. This website was created to give government agencies and rule writers access to state-of-the-art information-seeking and analytical tools for the eRulemaking process.
The Carnegie Mellon eRulemaking project focuses primarily on a set of technical challenges related to the effective use of large amounts of unstructured public commentary. Citizens and government administrators need a variety of navigation aids and analysis tools to help them understand the contents of large public comment databases. These aids and tools include full-text search, near-duplicate detection, and stakeholder identification, among others. The underlying technologies are primarily information retrieval, text mining, and natural language processing.
- Near-Duplicate Detection
Near-Duplicate Detection is usually the first tool that agencies use to analyze comments submitted electronically. High-profile rulemakings can attract hundreds of thousands of comments, which makes considering every comment time-consuming and labor-intensive. When comment volumes are large, the comments often include form letters that consist of duplicated text, and modified form letters that mix duplicated and unique text. Our DURIAN software assists analysts by identifying and grouping form letters, and by isolating the unique text in modified form letters, so that attention can be focused on the unique portions of each comment. The software organizes comments into clusters that can be easily browsed, provides summary statistics about each cluster (e.g., the number of duplicates), and highlights text that an individual added to or deleted from a form letter.
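The general idea of grouping form letters and surfacing the text an individual added can be illustrated with a small Python sketch. This is not the DURIAN algorithm, which is not described on this page; it is a generic stand-in that assumes word-shingle Jaccard similarity for clustering and the standard library's difflib for highlighting. The function names, the shingle size, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import difflib


def shingles(text, k=3):
    """Return the set of k-word shingles in a comment (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two shingle sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_comments(comments, threshold=0.5):
    """Greedy clustering: each comment joins the first cluster whose
    seed comment is similar enough, otherwise it starts a new cluster.
    The threshold is an assumed value, not a tuned one."""
    sets = [shingles(c) for c in comments]
    clusters = []  # each cluster: {"seed": index, "members": [indices]}
    for i, s in enumerate(sets):
        for cluster in clusters:
            if jaccard(s, sets[cluster["seed"]]) >= threshold:
                cluster["members"].append(i)
                break
        else:
            clusters.append({"seed": i, "members": [i]})
    return clusters


def added_text(form_letter, comment):
    """Return word spans present in the comment but not in the form letter."""
    a, b = form_letter.split(), comment.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        " ".join(b[j1:j2])
        for tag, _i1, _i2, j1, j2 in matcher.get_opcodes()
        if tag in ("insert", "replace")
    ]
```

Given a form letter, a copy of it with a personal sentence appended, and an unrelated comment, `cluster_comments` would be expected to place the first two in one cluster and the third in its own, and `added_text` would return the appended sentence. A pairwise comparison against cluster seeds is only practical for small collections; hundreds of thousands of comments would call for a more scalable scheme.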
- Stakeholder Identification
Another important question for U.S. regulatory agencies as they propose new regulations is "Who cares?" The answer to this question is the stakeholders. A stakeholder is an individual, group, organization, or community that has an interest or stake in a consensus-building process. When a new regulation is being considered and debated, the stakeholders are the communities or interest groups that the authors of public comments represent, the groups or communities that will be affected by a regulation change, and the agencies or government entities involved. Stakeholder identification involves (1) automatically identifying stakeholder mentions in public comments and (2) organizing the extracted stakeholders into a hierarchical ontology that provides easy navigation and access to documents that mention stakeholders of particular interest to the user.
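The two steps above can be sketched in Python as a toy example. The project's actual extraction and ontology-construction methods are not described on this page, so this sketch substitutes a much simpler technique: matching comments against a small hand-written term list and grouping the matches under parent categories. The `TAXONOMY` contents and the function names are hypothetical and exist only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical mini-taxonomy: stakeholder term -> parent category.
# A real system would extract and organize these automatically.
TAXONOMY = {
    "farmers": "industry",
    "power plants": "industry",
    "anglers": "recreation",
    "tribal governments": "government",
}


def find_mentions(text):
    """Step 1: return the known stakeholder terms mentioned in a comment."""
    lowered = text.lower()
    return sorted(
        term for term in TAXONOMY
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    )


def build_index(comments):
    """Step 2: map category -> stakeholder -> ids of comments mentioning it."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in enumerate(comments):
        for term in find_mentions(text):
            index[TAXONOMY[term]][term].append(doc_id)
    return index
```

A user interested in industry stakeholders could then look up `index["industry"]` to see which stakeholders were mentioned and which comments mention each one. A fixed term list misses stakeholders it does not already know about, which is why automatic identification is the harder and more useful problem.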
Copyright © 2007, 2008 Carnegie Mellon University. All Rights Reserved.
Maintained by Jamie Callan and Grace Hui Yang.
Last modified: January 24, 2007. 16:51:06 pm


