PROMULGATING GRADUATE RESEARCH AND DISSEMINATING STUDENT INFORMATION THROUGH WEB APPLICATIONS AND DATABASE MANAGEMENT Li-ching Chien Except where reference is made to the work of others, the work described in this thesis is my own or was done in collaboration with my advisory committee. This thesis does not include proprietary or classified information. ______________________________ Li-ching Chien Certificate of Approval: ___________________________ Juan Gilbert Associate Professor Computer Science and Software Engineering ___________________________ Min-Te Sun Assistant Professor Computer Science and Software Engineering ____________________________ Kai Chang, Chair Professor Computer Science and Software Engineering ____________________________ Stephen L. McFarland Acting Dean Graduate School PROMULGATING GRADUATE RESEARCH AND DISSEMINATING STUDENT INFORMATION THROUGH WEB APPLICATIONS AND DATABASE MANAGEMENT Li-ching Chien A Thesis Submitted to the Graduate Faculty of Auburn University in Partial Fulfillment of the Requirements for the Degree of Master of Science Auburn, Alabama May 11, 2006 iii PROMULGATING GRADUATE RESEARCH AND DISSEMINATING STUDENT INFORMATION THROUGH WEB APPLICATIONS AND DATABASE MANAGEMENT Li-ching Chien Permission is granted to Auburn University to make copies of this thesis at its discretion, upon request of individuals or institutions and at their expense. The author reserves all publication rights. __________________________ Signature of Author __________________________ Date of Graduation iv VITA Li-ching Chien, daughter of Cheng-hsiung Chien and Yu-jui Huang, was born in Taoyuan, Taiwan, on August 11, 1972. After graduating from Soochow University with a Bachelor of Law in 1995, she earned a Master of Arts in International Administrative Studies with a concentration in business from Ohio University in 1997. In August 2001, she entered the graduate program in the Department of Computer Science and Software Engineering at Auburn University. While in pursuit of the Master of Science degree, she worked as a graduate assistant for the Auburn University Graduate School under Dean Stephen L. McFarland. v THESIS ABSTRACT PROMULGATING GRADUATE RESEARCH AND DISSEMINATING STUDENT INFORMATION THROUGH WEB APPLICATIONS AND DATABASE MANAGEMENT Li-ching Chien Master of Science, May 11, 2006 (Master of Arts., Ohio University, 1997) (Bachelor of Law, Soochow University, 1995) 92 Typed pages Directed by Kai Chang ASP.NET-based Web applications provide a means of promulgating graduate research and disseminating student information. Integrated with MS SQL Server databases, they provide on-demand, near-instantaneous access to information without the need for special user training, directly to users. As a server-side technology, this platform takes advantage of Common Language Runtime, compiled code, Object Oriented Programming, and multiple programming languages. Using actual, functioning university systems, this study discusses the problems of and solutions for security, login mechanisms, data collection, data validation, data storage, search engines, user interfaces, user authentication and authorization, and data distribution. vi Style Manual or Journal used: GUIDE TO PREPARATION AND SUBMISSION OF THESIS AND DISSERTATION 2005 Computer Software used: MS ? Word vii TABLE OF CONTENTS LIST OF FIGURES AND CHARTS 1. INTRODUCTION 1.1 Web-based application & database management 1.2 .NET and ASP.NET 1.2.1 Creating server-side dynamic Web pages 1.2.2 Compiled code and application configuration 1.2.3 Multiple programming language support 1.2.4 Object-Oriented Programming (OOP) 1.3 Microsoft SQL Server 2. AUETD AND GTUFS 2.1 AUETD 2.1.1 Student submission 2.1.2 Graduate School administration 2.1.3 Public access to AUETD 2.2 GTUFS 2.2.1 Student login 2.2.2 Department login 2.2.3 College login 2.2.4 Graduate School login 3. SECURITY AND LOGIN 3.1 Layers of security 3.1.1 Determining access ? first layer of security 3.1.2 Identifying access type or access level ? second layer of security 3.1.3 Collecting data about users from existing databases 3.2 Windows Domain login 3.2.1 ASP.NET authentication 3.2.2 ASP.NET authorization 3.2.3 Example of Windows Domain login mechanism 3.2.4 Conclusion for Windows domain login mechanism x 1 1 2 3 4 5 6 7 8 9 10 12 16 17 19 20 20 22 27 27 27 28 30 32 32 33 34 36 viii 4. COLLECTING THE DATA 4.1 Collecting/uploading data from users through ASP.NET Web forms 4.2 File upload 4.2.1 Types of file upload 4.2.2 Upload file formats 4.3 Data Validation 4.3.1 Types of data validation ? client side vs. server side 4.3.2 What the ASP.NET data validation controls do 5. STORING THE DATA 5.1 Storing data with ADO.NET ? the interface between the Programming languages of Web applications and databases 5.2 Database systems 5.2.1 Database structure in AUETD 5.2.2 Database structure in GTUFS 5.3 Selected issues in storing the data 6. DISTRIBUTING THE DATA 6.1 Distributing the various types of data through Web applications 6.1.1 Non-restricted information ? public information 6.1.2 Restricted information ? private information or public information with restricted access 6.2 Search engine in AUETD 6.2.1 Search function for general user 6.2.1.1 Search by KEYWORD 6.2.1.2 Search by AUTHOR, MAJOR PROFESSOR, TITLE 6.2.1.3 Search by DEPARTMENT or YEAR 6.2.1.4 Search results 6.2.1.5 Interface design considerations for the general search function 6.2.2 Search function for AUETD administrators 6.2.2.1 Search criteria 6.2.2.2 Search results 6.2.2.3 Interface design considerations for the administrator search function 6.2.3 Conclusions about the AUETD search engine 7. CONCLUSION 38 38 39 39 42 44 44 45 47 48 51 51 52 55 58 58 58 59 61 61 61 65 66 67 69 73 73 74 76 76 77 ix REFERENCES BIBLIOGRAPHY 79 80 x LIST OF FIGURES AND CHARTS Fig. 1.1 The major components of the .NET framework Fig. 1.2 Creating server-side dynamic Web pages Fig. 2.1 The home page of AUETD Fig. 2.2 AUETD student login page Fig 2.3 AUETD student profile page Fig 2.4 AUETD upload page Fig. 2.5 AUETD confirmation page Fig. 2.6 AUETD pending approval page Fig. 2.7 AUETD approval page Fig. 2.8 AUETD administrative search page Fig. 2.9 AUETD add new ETD page Fig. 2.10 AUETD UMI page Fig. 2.11 AUETD public search page Fig. 2.12 GTUFS home page Fig. 2.13 GTUFS student page Fig. 2.14 GTUFS department page Fig. 2.15 GTUFS college page Fig. 2.16 GTUFS college GRFs nomination page 2 3 9 11 11 12 12 13 13 14 14 15 16 17 19 20 21 22 xi Fig. 2.17 GTUFS Graduate School page Fig. 2.18 GTUFS data import page Fig. 2.19 GTUFS update record page Fig. 2.20 GTUFS GRFs processing page Fig. 2.21 GTUFS low GPA page Fig. 3.1 The login page for student submission Fig. 3.2 The login error page for student submission Fig. 5.1 The three major components in ADO.NET Fig. 5.2 The OleDbClient example in the GTUFS database update process Fig. 5.3 The SQLClient example in the GTUFS database update process Fig. 6.1 Stopwords example Fig. 6.2 The search result page of AUETD Fig. 6.3 The ETD search page of Chung Yuan Christian University Fig. 6.4 The first page of AUETD Fig. 6.5 The search page of AUETD Fig. 6.6 The administrator search page of AUETD Fig. 6.7 The administrator search result page Fig. 6.8 The administrator edit/approve page Chart 2.1 GTUFS architecture 22 23 24 25 25 36 36 48 49 50 65 69 70 72 73 74 75 75 18 1 CHAPTER 1 INTRODUCTION 1.1 Web-based application & database management Functionality-oriented Web applications are emerging on the Internet to compete with e-mails and Web pages. They provide personal and exclusive information customized for the individual user. They allow users without specialized training to interact with information systems and with other users by transmitting information and data when required or on demand. Microsoft defines a Web Application as a software program that uses HTTP for its core communication protocol and delivers Web-based information to the user in the HTML language. Such applications are also called Web-based applications. Although a custom-built client for such applications is possible, most applications will leverage an existing Web browser client such as Internet Explorer, Mozilla FireFox, Netscape Navigator, or Opera. [1] For collecting, storing and distributing data, Web applications are primarily database applications by which Web pages are created dynamically. Such dynamic creations provide users a current and up-to-date view of information stored in the database. 2 Different technologies can be used for developing database-driven Web applications as a form of interactive media. Microsoft ASP.NET technology is chosen for this study because of the features it offers in efficiently generating dynamic content for users. Microsoft SQL Server 2000 is chosen to be the back-end database engine because of its integration into ASP.NET and its capabilities of satisfying the performance, ease-of-use, and standard compliance requirements expected from a database system for use on the Web. 1.2 .NET and ASP.NET Fig. 1.1 The major components of the .NET framework ASP.NET is a powerful server-side technology for creating dynamic Web pages. Developed by Microsoft, it is an Internet-based Web application development platform. It is fully integrated into the .NET framework, which enables the developer to take .NET Framework ASP.NET Web Application Services Windows Application Services .NET Framework Base Classes (Object-oriented) Common Language Runtime (CLR) 3 advantage of Common Language Runtime (CLR) 1 , inheritance and other features of that platform. Before ASP.NET, the most popular Web development tool provided by Microsoft has been Active Server Pages (ASP), which made the development of database-enabled Web sites easy. However, ASP commonly relies on either JavaScript or VBScript scripting languages to create dynamic Web pages, which adds complexity and exposes the system to variable client browser settings that would interfere with the proper functioning of the script. It also lacks advanced features such as compiled code, Object Oriented Programming, and multiple programming language support. ASP.NET resolves these limitations. 1.2.1 Creating server-side dynamic Web pages Fig. 1.2 Creating server-side dynamic Web pages 1 As part of Microsoft?s .NET Framework, the Common Language Runtime (CLR) is a programming environment that manages the execution of programs written in any of several supported languages, allowing them to share common object-oriented classes written in any of the languages. For further information, see http://searchvb.techtarget.com/sDefinition/0,,sid8_gci860097,00.html 4 In ASP.NET, a Web page is generated dynamically upon request, using a server- based model for HTML controls. Unlike a static Web page that essentially consists of HTML code and has limited capabilities to display personalized and timely information, a dynamic Web page is created in real time, often with reference to a database query based on input from the user. To generate a server-side dynamic Web page, as demonstrated in Fig. 1.2, a Web author writes a set of instructions for creating HTML, and saves the instruction files on the Web server. Then a user makes a request to the Web server through the browser, and the server follows the instructions to create a stream of HTML, sending the stream back to the browser. Finally the browser processes the HTML and displays the Web page [2]. All the processing is done on the server before the Web page is rendered to the browser for display. Compared to client-side technologies, server-side technologies reduce Web page download time, hide programming logic from the public, offer cross-browser capabilities because the browsers are not required to interpret instructions, and more easily incorporate into server-side resources (such as databases). 1.2.2 Compiled code and application configuration There are two types of compilation that translate the code in human language into machine code: interpreted code and compiled code. Unlike interpreted code that is translated when executed, compiled code is compiled and ready to be executed before users need it. It provides faster execution because the compiler has the chance to examine the full set of code and to optimize it. 5 ASP.NET Web pages are compiled on the server and therefore they perform more efficiently than Web pages generated by interpreted code technologies. In ASP.NET, configuration information is stored in XML-based Web.Config files, which can be accessed and edited by Web application developers. Developers therefore no longer need to wait for system administrators to change IIS (Internet Information Services) settings on the server and reboot the server for the settings to take effect. 1.2.3 Multiple programming language support Compared to most technologies that are limited to scripting engines, the .NET framework inherently supports multiple languages. .NET applications run in Common Language Runtime (CLR), which includes a common system of data types that make cross-language inheritance possible. Therefore, the scripting of an ASP.NET Web Form can be done in any of .NET-supported development languages such as C#, Managed C++, Visual Basic.NET, JScript.NET, or others [3]. Multiple language support is not limited only to what is available, but also to how it is used. CLR allows ASP.NET to use one language to subclass a class that was or can be implemented in another language, using metadata 2 to manage all the interfaces and calling conventions between languages. Cross-language inheritance in .NET opens up an architectural option in which mixed language programming teams become more feasible than before. 2 Metadata is defined as ?data about data.? It is the key to the simpler programming model that the CLR supports. The metadata is generated by a compiler and stored automatically in an EXE or DLL. It is in binary, but the framework offers an API to export metadata to and from an XML schema or a COM type library [4]. 6 1.2.4 Object-Oriented Programming (OOP) The techniques of object-oriented programming were developed to solve the problems of managing ever larger and more complex programs. The code reusability in OOP provides a clear modular structure for programs, makes it easy to maintain and modify existing code, and offers a good framework for code libraries. The exact meaning of ?object-oriented programming? depends on the term?s context. Most programmers agree that OOP involves at least the partitioning of a program into classes, which combines logically related sets of data with the functions that act on that data [5]. In .NET, the CLR provides access to common base functionality, such as string searching, to all languages via the Base Class Library. All the functionality of the CLR and the class libraries built on top of it are exposed as methods of objects. Namespaces are used by .NET to group classes in functionally similar groups. System.Drawing, for example, is a namespace that contains the group of functions for drawing. The most useful objected-oriented techniques provided by .NET are inheritance and object constructors [5]. In .NET, the inheritance technique is implemented by deriving classes from ?base classes? or ?parent classes,? which are object classes that offer generic functionality. The .NET framework uses inheritance to provide system functionality. All objects in the .NET system are derived from System.Object or from any other classes derived directly or indirectly from System.Object. Most classes in .NET can serve as derivative parent classes by using the keyword ?Inherits? in the code. Exceptions include a few complicated and fragile classes that are marked as ?NotInheritable,? such as System.String. 7 The object constructor function is used to initialize the object of a class. If a class contains a constructor, an object created by that class will be initialized automatically. In .NET, constructors are created by adding a Sub procedure named ?New? to a class. 1.3 Microsoft SQL Server Microsoft SQL Server is capable of satisfying the performance, ease of use, and standard compliance requirements that are expected from a database system for use on the Web. Moreover, it has an optimized ADO.NET provider called SqlClient in addition to the OLE DB provider, which incorporates well with the technology chosen in this study - ASP.NET. The latest version of SQL Server (2005) includes .NET Common Language Runtime (CLR) integration, which provides a mechanism to take advantage of .NET Base class libraries and the .NET runtime. The .NET CLR performs computationally intensive tasks. The SQL Database engine fetches, stores, and manipulates data. The integration of these two technologies offers an efficient and effective option for developing Web applications and managing databases. 8 CHAPTER 2 AUETD AND GTUFS Using Web applications backed by databases can revolutionize the promulgation of graduate research and the dissemination of student information at universities. Previously faculty and students were dependent on generally inaccessible paper forms, bound volumes, microfilms, and paper reports that were obsolescent at the time they were printed. Electronic applications can eliminate the expense, complexity, labor, and expired data of such physical media, substituting constantly and continuously updated information available anywhere in the world through the use of a Web browser. The Auburn University Electronic Thesis and Dissertation (AUETD) system was created to allow graduate students to publish the results of their research. The Graduate TUition Fellowship System (GTUFS) was created to distribute up-to-date, accurate information about tuition fellowship eligibility. These two systems will be used to illustrate and explain the discussion of Web applications and the management of databases. 9 2.1 AUETD - Auburn University Electronic Thesis and Dissertation (http://graduate.auburn.edu/auetd ? Fig. 2.1) Fig. 2.1 The home page of AUETD A fundamental tradition of graduate education is the requirement that all degree- earning students publish the results of their research. Since the development of moveable character in China during the eleventh century and in Europe in the fifteenth century, research has been promulgated through printed media?books and journals. The development of modern databases and the World Wide Web have offered the opportunity to move these research results to a new medium. At Auburn University, graduation has traditionally required the expensive and time-consuming production of a minimum of three copies of the master?s thesis and four copies of the doctoral dissertation on expensive 25 percent cotton bond paper to qualify for graduation. More typically, students will require more than the minimum number of copies--the record in recent history was 28 copies of a dissertation?because of the need to distribute their work to colleagues, publishers, potential employers, family, and 10 friends. A copy had to be shipped to University Microfilms International (UMI)/ProQuest to satisfy the requirement that students publish their research results. The Auburn University library also has required a copy for use by its patrons. The whole process consumed time, labor, money, material, and space because it created significant costs in printing, binding, shipping, storing, and public accessing. The creation of AUETD solves the above issues relating to the process of submitting and publishing theses and dissertations. There are three major components of AUETD ? student submission, Graduate School administration, and public access. 2.1.1 Student submission Student submission is simple and fast. After clicking the ?Submit ETD? button from the AUETD home page (Fig. 2.1), the student login page (Fig. 2.2) is displayed with simple instructions for students to submit their work to AUETD. Access is controlled by the Auburn University Windows domain login mechanism. Only Auburn University graduate students will be authenticated to AUETD login. After login, students enter only their basic graduation information because behind the scenes their biographic data has already been collected by a process that matches the login user ID with central university databases (Fig. 2.3). Students can choose to restrict access to their work up to three years if concerned about copyright, patent, or publication. Students are given a chance to examine the information they entered before uploading their work (Fig. 2.4). A confirmation page is then shown and an e-mail sent to the students after their theses or dissertations are successfully uploaded (Fig. 2.5). 11 Fig. 2.2 AUETD student login page Fig 2.3 AUETD student profile page F 12 Fig 2.4 AUETD upload page Fig. 2.5 AUETD confirmation page 2.1.2 Graduate School administration After theses and dissertations are uploaded by students, they appear on the pending list (Fig. 2.6) for the Graduate School Thesis and Dissertation Advisor to review and approve (Fig. 2.7). Automatic e-mails are sent to the students when their ETDs are 13 approved. The approved ETDs are instantly available for search and viewing by the public, subject to the access restrictions the authors requested. Fig. 2.6 AUETD pending approval page Fig. 2.7 AUETD approval page 14 The Graduate School Thesis and Dissertation Advisor can search for individual students or certain groups of students in the AUETD database (Fig. 2.8). Fig. 2.8 AUETD administrative search page In some special cases, students are not able to upload their theses or dissertations online. The Thesis and Dissertation Advisor can help students submit their work in such cases. After entering the student?s Auburn Global ID (Fig. 2.9), the advisor receives the student?s profile page (Fig. 2.3), including pre-populated data imported from central databases. From there, the submission procedure duplicates student submission described in section 2.1.1 above. Students receive automatic confirmation e-mails after their ETDs are submitted by the advisor. Fig. 2.9 AUETD add new ETD page 15 Beginning in 1928, the rising number of theses and dissertations requiring publication to satisfy degree requirements and the rising costs of publication encouraged the formation of University Microfilms International (UMI) at the University of Michigan. Universities quickly began substituting publication on microfilm at UMI for the more traditional forms of publication in journals and in book form. UMI was purchased by Bell & Howell in 1985 and in recognition of an increasing use of digital data, changed its name to ProQuest in 2001. Theses and dissertations submitted on paper for conversion to microfilm came under attack in the digital age because of the associated costs, inconvenience, slowness, complexity, and other obstacles to quick and accurate distribution of the information they contain. Beginning in 1987, UMI/ProQuest joined with Virginia Tech University to organize and promote the concept of electronic thesis and dissertation publication as a solution to the above issues. In 2005, UMI/ProQuest began accepting electronic theses and dissertations in PDF format. This is one of the main forces behind the creation of AUETD at Auburn University. Beginning Summer Semester 2005, all approved electronic dissertations at Auburn University are submitted to UMI via FTP (Fig. 2.10). Fig. 2.10 AUETD UMI page 16 2.1.3 Public access to AUETD All approved electronic theses and dissertations can be listed and searched by the public. The search engine is available on the home page of AUETD (Fig. 2.1) or by clicking the ?Search AUETD? button on the AUETD banner, which is displayed on every Web page of AUETD (Fig. 2.11). AUETD can be searched by TITLE, AUTHOR, KEYWORD, DEPARTMENT, MAJOR PROFESSOR, and GRADUATION YEAR. The AUETD search engine will be described and explained in Chapter 5, Section 2. Fig. 2.11 AUETD public search page 17 2.2 GTUFS ? Graduate Tuition Fellowship System (http://graduate.auburn.edu/gtufs - Fig. 2.12) Fig. 2.12 GTUFS home page Paper forms and reports overwhelm bureaucracies and administrative offices. They are cumbersome restrictions on the proper and accurate flow of information, requiring storage, archiving, handling, distribution, updating, correction, and security. References to ?mountains of paper? associated with such offices are quite common. Administering Auburn University?s tuition fellowship programs created a unique challenge involving complex rules of eligibility. This required a constant transfer of information from central databases accessible only with great difficulty to users through a slow and tedious process involving paper forms, paper reports, and multiple telephone calls. In 1998 Auburn University began awarding tuition fellowships to a subset of its graduate assistants. The continued expansion of this program incurred substantial costs, forcing restrictions on eligibility. For Spring Semester 2004, to insure the continued financial viability of the tuition fellowship program, university leaders established 18 eligibility limits of 40 hours of tuition support for master?s and 80 hours of tuition support for doctoral students along with minimum academic performance requirements. Initial confusion stemmed from a ?grandfather? clause that exempted previous recipients. Additional confusion about who was eligible and who was not, who had exhausted his or her eligibility and who had not, and who had met minimum GPA requirements demanded an information technology solution. The creation of GTUFS solved the above issues relating to the process of awarding graduate tuition fellowships. There are four levels to GTUFS: student, department, college, and Graduate School (Chart 2.1). Students may research their fellowship history and eligibility. Departments may examine these records for their majors, while colleges may examine the fellowship records of all departments affiliated with those colleges. Student, department, and college access is limited to information pulled from the database, though colleges also use GTUFS to upload nomination lists to the database. The Graduate School serves as the administrator, controlling all aspects of the process. GRAD SCHL COL1 COL2 COL3 COL4 DEPT1 DEPT2 DEPT3 DEPT4 Student1 Student2 Student3 Student4 Chart. 2.1 GTUFS architecture 19 2.2.1 Student login (Fig. 2.13) GTUFS keeps track of the number of credit hours and semesters for which a student is eligible for tuition fellowships, the number of credit hours and semesters of tuition the university has paid for the student, the number of eligible hours or semesters of tuition fellowships that remain for the student, and if there is an exception that would apply to the student?s situation. A student can also review a detailed tuition support history listed by semesters. If the student has exhausted his or her eligibility for a tuition fellowship, lost eligibility because of poor academic performance, or received an exception from the Graduate School, a comment is displayed on the student page along with an automatic e-mail sent to the student from GTUFS. Fig. 2.13 GTUFS student page 20 2.2.2 Department login (Fig. 2.14) Departments can access tuition fellowship information for those students enrolled in their departments. Student information can be traced back to 1998 when Auburn University first offered graduate tuition fellowships. Departments can select information for a specific term or all available terms, with the current semester as the default. Students? eligibility for tuition fellowships is displayed on the list by showing their remaining semesters or hours. Departments can also view detailed tuition fellowship information for individual students. A search function allows access to any departmental fellowship recipients. Fig. 2.14 GTUFS department page 2.2.3 College login (Fig. 2.15) Colleges can access tuition fellowship information for those students whose majors belong to their colleges. As stated earlier, student information can be traced to 1998. Colleges can select information for a specific term or all available terms, with the current semester as the default. Student eligibility is displayed showing remaining 21 semesters or hours. Colleges can also view detailed tuition fellowship information for individual students. Until Summer Semester 2005, Graduate Research Assistants had to be nominated by their departments to receive tuition fellowships. Colleges collected these nominees from departments and then submitted their nomination lists to the Graduate School through GTUFS (Fig. 2.16). The College of Engineering, for instance, had an allocation of 206. To reduce the time the college staff needed to enter 206 records, GTUFS supplied an initial list from the database comprised of recipients from the previous semester. Staff members only needed to modify the list by removing or adding students. When students were added to the nomination list, GTUFS automatically sorted them into three categories: enrolled in the college, not enrolled in the college, and new students who had no previous GTUFS record. The number of fellowships remaining in the college?s allocation was updated continuously as the college staff added and removed nominations. Once a college?s allocation was exhausted, GTUFS disabled the ?add student function.? A search function is available to provide access to any college fellowship recipients. Fig. 2.15 GTUFS college page 22 Fig. 2.16 GTUFS college GRFs nomination page 2.2.4 Graduate School login (Fig. 2.17) Fig. 2.17 GTUFS Graduate School page The role of the Graduate School is to administer the process of tuition fellowships. There are four major functions involved in Graduate School administration. 23 1. Import and process data Auburn University tuition information is maintained in OASIS - the Online Auburn Student Information System. Data is stored in an IBM mainframe database system with restricted access, and the Graduate School does not have access rights. A download program, requested by the Graduate School and written by a mainframe programmer, is used to retrieve data identified by the Graduate School and then exported to an Excel spreadsheet. The spreadsheet is downloaded, filtered, sorted, and uploaded to the GTUFS database in Microsoft SQL Server (Fig. 2.18). The upload process involves a series of data calculations and data transformations that will be discussed in Chapter 4. The upload process is generally done twice a week, following the university?s tuition calculation process in OASIS. Fig. 2.18 GTUFS data import page 2. Update individual student record Student eligibility for tuition fellowships is determined by degree information at the time records are uploaded, including any special restrictions imposed by administrators. When student degree information is changed or administrative decisions 24 are added, data such as the fellowship end semester, administrative allowable hours, or comments must be manually updated (Fig. 2.19). Fig. 2.19 GTUFS update record page 3. Fellowship eligibility processing Graduate Teaching Assistants (GTAs), Graduate Research Assistants (GRAs), Graduate Assistants (GA), and Graduate Extension Assistants (GEAs) are subject to different rules for receiving tuition fellowships. Students who first enrolled before Spring Semester 2004 and students who first enrolled Spring Semester 2004 and after are also subject to different rules. GTUFS offers a fellowship eligibility processing function that identifies students who have exhausted their eligibility, updates the student records to stop additional awards, and sends warning e-mails (Fig. 2.20). Student GPAs are also imported, recorded, and tracked in GTUFS. GPA warnings or GPA suspensions are assigned to students performing poorly. GTUFS 25 academic records and comments are updated automatically. Warning e-mails are sent to students, their department heads, and their Graduate Program Officers (Fig. 2.21). Fig. 2.20 GTUFS GRFs processing page Fig. 2.21 GTUFS low GPA page 26 4. Query and display data The Graduate School staff can query tuition fellowship data by colleges, departments, or individual students. A search by Auburn User ID is also available from every page inside the GTUFS administrative component. 27 CHAPTER 3 SECURITY AND LOGIN 3.1 Layers of security System login serves three purposes. First, it determines if users have permission to access the system. Second, it sets level or type of authorized user. Third, it allows the system to collect needed information about users from existing databases. 3.1.1 Determining access ? first layer of security Authentication and authorization are important elements of developing secure sites. There are several authentication and authorization mechanisms available in ASP.NET. Both AUETD and GTUFS use a web.config file and Windows Domain login to authenticate and authorize the users against the Auburn domain. Only Auburn University students, faculty, and staff are allowed to access the secure Web pages in AUETD and GTUFS. After users enter their user names and passwords, the login information will be searched against the Auburn domain, which contains the account information of all students, faculty, and staff, and is maintained by the Auburn University Office of Information Technology. If the user name is not on the Auburn domain or the password does not associate with the legal user name, the user will be denied access to the Web pages defined as ?authorized access only? in the web.config file. 28 Using Auburn domain to validate the login, instead of defining the user list in the code or another separate file, benefits users, the Graduate School, and Auburn University. First, it is convenient to users. Users do not need to create and remember another set of login information. They use the same user names and passwords as used to check their Auburn University e-mail or to login to other Auburn University systems. Second, the accuracy of the account information is guaranteed to the Graduate School. The account information is maintained, updated, and secured by the Auburn University Office of Information Technology (OIT). If students, faculty, or staff leave the university for a certain period of time, their accounts would be removed from Auburn Domain accordingly. It simplifies the process for the Graduate School, which does not have to devote resources to authentication. Third, it standardizes the login process for the university, thereby reducing the cost of administering user information. It guarantees greater security from university-established password rules, which require minimum length and character combinations. 3.1.2 Identifying access type or access level ? second layer of security After the login information is authenticated and authorized through Auburn domain, AUETD or GTUFS takes the second step to secure the information by making sure that the user accesses the appropriate part of the system. Both systems use MS SQL Sever tables to identify the user?s status and access level. There are four types of users in AUETD and GTUFS: graduate students, departments, colleges, and Graduate School staff and administrators. 29 1. Graduate students The SCTPROD database contains the selected information from the Auburn University Student Information System (OASIS) and Human Resource System (HRS), which store personal data such as user ID and primary job code. AUETD and GTUFS take the login user ID to match the user ID in the SCTPROD database. When the user ID is found in the database, the systems check the associated student status to see if the user is a graduate student at Auburn University. Only Auburn University graduate students can be authorized to view and process the student portion of the systems. SCTPROD is updated and maintained weekly by the Auburn University Office of Information Technology (OIT) by a batch job. The data is pulled from the Student Information System (OASIS) and the Human Resource System (HRS), insuring accuracy and currency. 2. Departmental access The WAIVERS database is designed and created for GTUFS. Its DEPARTMENT table contains department codes, department names, the associated college codes, and authorized user IDs. Access is therefore controlled through an MS SQL Server table. If the same user ID appears with more than one department code, the user would be able to see multiple departments? records. 3. College access The COLLEGE table in the WAIVERS database contains college codes, college names, tuition fellowship quotas, and authorized user IDs. By joining the COLLEGE 30 table and the DEPARMENT table on college codes, the authorized users can see the student records in all departments of a college. 4. Graduate School staff and administrators In GTUFS, the staff and administrators list is stored in the ADMIN table. There are two kinds of administrative access in GTUFS. One is ?view only,? allowing staff to search, browse, and view student, department, or college records. The other is ?administrator access,? allowing the importation, updating, processing, and modification of data. In the AUETD database, the ADMINISTRATOR table has a list of users who can review, change, approve, or disapprove the theses or dissertations submitted by students. 3.1.3 Collecting data about users from existing databases Windows domain logins allow the systems to collect needed information about users from existing databases. Only Auburn University students, staff, and faculty can be authenticated and authorized to access GTUFS and AUETD. And because their personal and employment information has been previously stored and updated in university systems like OASIS and HRS, it would be inefficient and unnecessary to re-collect this existing information. a. Supporting database - SCTPROD The GID table contains data such as Auburn Global ID (GID), name, social security number, address, phone, and e-mail address. The HRS_EMP table has 31 information such as employment status, employment date, and primary job code. The login user ID is matched with the Auburn Global ID in the GID table. The HRS_EMP table does not contain the user ID. Because its primary key is the social security number, the HRS_EMP table must be joined with the GID table. b. Advantages 1. Efficiency: With the existing data about the login user available in such databases, the systems do not need to collect that information from users again. Users save time and program developers save the steps to maintain and update this basic information. 2. Accuracy of data: The data from the existing databases is maintained and updated by various Auburn University information systems. The accuracy of the data is therefore improved by multiple offices and users. It avoids the situation where users input misspelled names or incorrect departments. Users also avoid questions for which they have no answers, such as their primary job code. 3. Centralization of data: If peripheral or satellite systems use data from the central database, updates and changes only need to be done in the central systems. 4. User friendly interface: Users tend to have greater confidence in systems that recognize them by displaying their personal information such as names and e-mail address after login. It makes users feel that they are interacting with an intelligent system processing individualized knowledge of them. 32 c. Limitations Although much information about the individuals at Auburn University is stored in the Student Information System (OASIS) and Human Resource System (HRS), there is other data that needs to be uploaded or imported to the peripheral systems by users in order to achieve their specific purposes, such as faculty names or length of the thesis or dissertation. 3.2 Windows Domain login Auburn Windows Domain refers to computers on the Auburn University campus that utilize Microsoft?s Active Directory Services 3 . All Auburn University students, faculty, and staff have Auburn Windows Domain accounts. 3.2.1 ASP.NET authentication ASP.NET implements additional authentication schemes using authentication providers, which are separate from and apply only after the IIS authentication schemes 4 . ASP.NET supports the following authentication providers: Windows, Forms, Passport, and None. AUETD and GTUFS use the Windows authentication provider, which relies 3 ?Active Directory is an essential and inseparable part of the Windows 2000 network architecture that improves on the domain architecture of the Windows NT ? 4.0 operating system to provide a directory service designed for distributed networking environments.? For more information, see http://www.microsoft.com/windows2000/server/evaluation/features/dirlist.asp#heading3 4 An important part of many distributed applications is the ability to identify someone, known as a principal or client, and to control the client's access to resources. IIS authentication is the act of validating a client's identity. Generally, clients must present some form of evidence, known as credentials, proving who they are for authentication. Typically, credentials include a username/password pair. For more information, see http://msdn.microsoft.com/library/default.asp?url=/library/en- us/vsent7/html/vxconASPNETAuthentication.asp 33 upon IIS to perform the required authentication of a client. After IIS authenticates a client, it passes a security token to ASP.NET. ASP.NET then constructs and attaches an object of the WindowsPrinciple class to the application context based on the security token it receives from IIS. This allows code to check the Windows group membership of a Windows user. The advantage of using Windows authentication through Windows Domain accounts is there is no need to write custom authentication code [Microsoft MSDN library, 2003]. 3.2.2 ASP.NET authorization After authenticating a principle, the next step is to determine whether that principle has permission to access the resources it is requesting. This process is known as authorization. AUETD and GTUFS use configuration files, machine.config and web.config, to drive authorization information. All ASP.NET applications on a particular server inherit their security configuration from machine.config. Each ASP.NET application can in turn override some of the settings in machine.config using the application-level configuration file, web.config [Microsoft MSDN library, 2003]. In the web.config files of AUETD and GTUFS, the authorization process first denies access to users without Windows Domain accounts. 34 The process then overrides the above authorization by allowing some pages to be accessed by the general public. For example, the first page of AUETD (default.aspx) is accessible to the world: 3.2.3 Example of Windows Domain login mechanism Using AUETD student login as an example, only Auburn University graduate students are allowed to submit ETDs on AUETD. After the student clicks the login button on login.aspx, the Windows Domain login window pops up to prompt the student to enter his or her login ID and Windows Domain password (Fig. 3.1). After the student (the client) is authenticated and authorized to access AUETD resources through the Windows Domain, the page is redirected to createsession_stud.aspx where AUETD checks if the user is classified as an Auburn graduate student in the database. The ASP.NET code behind createsession_stud.aspx extracts the user login ID from the User.Indentity.Name namespace, and uses it as the search condition in the SQL statement to retrieve the user?s information from the GID table, which resides in the SCTPROD database. The SCTPROD database contains students, staff, and faculty information, and extracts data from the Human Resource System (HRS), Student Information System (OASIS), and Global ID System (GID). SCTPROD is synchronized weekly with the Auburn University HRS, OASIS, and GID systems. The GID table stores the Auburn Global ID (GID), Auburn identification number (IDNUM), department 35 (DEPARTMENT), name (NAME_FIRST, NAME_MIDDLE, NAME_LAST), and other information for each student registered at Auburn University. Searching through the GID table, AUETD identifies the status of the user and queries the information needed in AUETD, with the condition that the user?s class has to be at the graduate level: MST, EDS, EDD, PHD, GPR, or GND. The query is as follows: select IDNUM, Name_Last, Name_First, Name_middle from SCTPROD.dbo.GID where gid= ? & ?'" & (strlogin) & "'" & " and (STU_CLASS = 'MST' or STU_CLASS = 'EDS' or STU_CLASS = 'EDD' or STU_CLASS = 'PHD' or STU_CLASS = 'GPR' or STU_CLASS = 'GND') If the query returns no result, the login ID is not identified as a graduate student at Auburn University and the authorization attempt fails. The program then redirects the Web page to createsession_stud.aspx with an ?Authorization Inadequate? message (Fig. 3.2). If the user is identified as a graduate student, the query returns the user?s identification number, first name, middle name, and last name. AUETD saves the information in session variables that can be cross-referenced within the AUETD domain, and then redirects the page to infor.aspx where students provide the remaining information required for submitting their work. The student?s full name and Auburn e- mail address, which have been retrieved from the GID table, are shown at the top of the submission pages. In terms of user interface, displaying the name and e-mail address make users realize that AUETD has recognized them and gathered their information by their login IDs. The submission page therefore becomes a semi-customized Web form for a particular student, not just a blank form for anyone. 36 Fig. 3.1 The login page for student submission Fig. 3.2 The login error page for student submission 3.2.4 Conclusion for Windows domain login mechanism Adopting Microsoft Windows Domain takes advantage of the services provided by both Active Directory and NT Directory Services. Users login to AUETD and GTUFS with domain credentials, and then gain access to the permitted Web pages. The process provides secure access to both Active Directory and NT domains because 37 passwords are not sent over the network in plain text. No security certificates are required. The successful login information is cached on the client side. This allows users to access other restricted Web pages without another login until the session expires. For Auburn University students and employees, Windows Domain login names are Auburn User IDs and Windows domain passwords are usually the same as other Auburn passwords. Users do not need to remember another set of IDs and passwords. In addition, the Auburn University office of Information Technology provides assistance in synchronizing or resetting passwords. Therefore Windows Domain login is not only secure but also convenient to users. 38 CHAPTER 4 COLLECTING THE DATA Both AUETD and GTUFS serve the function of collecting and handling data that is either from the login users through Web forms and file upload, or from other existing databases through files like spreadsheets. 4.1 Collecting/uploading data from users through ASP.NET Web forms An ASP.NET Web application consists of three parts: content, program logic, and Web configuration information. The content part determines the appearance of a Web application, which includes text, images, buttons, text boxes, and all other HTML and Web forms controls. The program logic determines how the application responds to the user actions. It makes the elements of the Web page change on the fly. The Web configuration information determines how the Web applications run on the server, how the error messages display to the users, or who is authorized to access the applications. The ASP.NET Web form is the key element of an ASP.NET Web application. Its content portion, which has the similar appearance and behaviors of an HTML page, is defined in files with the extension ?aspx.? Each aspx file associates with a program logic, which is defined in an executable file with the extension ?asp.vb? or ?asp.cs? depending on whether Visual Basic or C# was chosen when the Web project was 39 created. The executable portion of the Web forms is stored in the Dynamic Link Library, an executable, which runs on the server under the control of IIS. When the user accesses a Web form, the user can type in the information, select an option, or click a button to submit the data requested by the application. The appearance or the controls of the Web form could be changed dynamically if some actions taken by the user, such as making certain selections, trigger event processing. 4.2 File upload There are two types of file upload in AUETD and GTUFS depending on how the files are to be used. In the first type, simple data transmission upload, files are published on the Web without conversion, transformation, or validation (other than file type). In the second type, complex data transformation upload, files are extracted from databases, processed, transformed, validated, combined, and calculations are performed. 4.2.1 Types of file upload A. Data transmission upload Data transmission upload is applied in AUETD. The students upload electronic theses or dissertations in PDF (Portable Document Format) through the application. After a successful upload, the files are saved and archived in designated folders on a FrontPage server, and the information associated with the uploaded files, such as the student name, the file size, the file name, or the upload date, is stored in the MS SQL Server database. The uploaded file is published and available through the World Wide 40 Web after the application administrator, the thesis and dissertation coordinator at the Auburn University Graduate School, examines the content of the file and approves it. The data of the file is presented to the public in its original form as the author submitted it. The application serves the role of transmitting the file with data from the user to the server and posting it on the Web with certain constraints on its availability. B. Data transformation upload Data transformation upload is employed in GTUFS. The process is more complicated than data transmission upload. The tuition fellowships information is collected and stored in the DB2 mainframe database system through the On-line Auburn Student Information System (OASIS). To export the information to the database that the Graduate School can administer, a spreadsheet (the GTUFS system uses Excel) serves as an interface between the Auburn University DB2 mainframe database system and the MS SQL Server database system used for the application. After downloading the data from the AU central database to a spreadsheet and sorting the spreadsheet to match the upload procedure parameters, the GTUFS administrator imports the data into the database by uploading the spreadsheet through GTUFS. After the upload button is clicked, five functions are automatically triggered: 1. The spreadsheet is uploaded to the AU FrontPage server where the GTUFS system resides. AU FrontPage server is a Microsoft Web server with programming components and Windows tools such as Microsoft Internet Information Services (IIS), the Microsoft programming languages, the .NET 41 Framework 5 , ADO.NET database classes and tools 6 , and Microsoft Application Center Test (ACT) 7 . 2. GTUFS connects to the spreadsheet, opens it, retrieves the data in the spreadsheet, and then exports the data to a DataSet. A DataSet consists of a collection of tables, relationships, and constraints. It can be considered an in- memory cache of data retrieved from the source database. 3. GTUFS connects to the destination database in the AU Microsoft SQL Server, and opens it. 4. GTUFS refreshes the destination database by deleting all existing records in the TERM_WAIVER table and the CAREER _INFOR table (the database structure in GTUFS will be discussed in Chapter 5, ?Storing the Data?). 5. GTUFS loops through the rows in the table of the DataSet, manipulates the data, and then inserts it into the SQL server. The data manipulation includes: a. Data conversion: The fellowship amount for each student, for example, is recorded as a negative number (financial debit) in the OASIS IBM DB2 mainframe system. It is converted to a positive dollar amount before inserting into MS SQL Server. 5 The .NET Framework is the complete set of Windows programming classes. They include the ASP.NET classes as well as classes for other programming tasks such as file access, data type conversion, array and string manipulation, and so on [6]. 6 ADO.NET database classes and tools provide access to Microsoft SQL Server and ODBC databases. Data access is often a key component of Web applications [6]. 7 Microsoft Application Center Test (ACT) provides an automated way to stress-test Web applications [6]. 42 b. Data calculation: The total amount of fellowship tuition that each student has received, for instance, is not available in the source DB2 database. It is calculated while the system is looping through the data. c. Data translation: There is no data field in the source DB2 database indicating which type of fellowship the student has received. GTUFS therefore has to translate the fellowship tuition amount that the student has received into the student?s fellowship category. The record from the spreadsheet is broken down, manipulated, and then inserted into three different tables in the GTUFS database. The TERM_WAIVER table reflects the data structure in the spreadsheet. Each record represents one student for one semester. The CAREER_INFOR table is the summary of the TERM_WAIVER table after the data calculation. Each record represents one individual student. The EXTENDED_CAREER table stores administrative decisions for students. Each record also represents an individual student. Students who are new to the database (based on their social security numbers) will be added to the table during data insertion. 4.2.2 Upload file formats 1. PDF for AUETD PDF is cross platform, navigational, ultra-printable, ultra-viewable, and smaller than other conventional document formats. It is considered a universal standard. Theses and dissertations often contain images, graphics, equations, and various character sets (e.g., Greek letters). The majority of graduate students work in Windows, Mac, or UNIX 43 platforms. PDF offers reproducible results and multiplatform support for viewing and printing by reviewers if the original files are in the proper format. No other mature document type permits easy viewing by reviewers and allows authors to retain margins; keep pagination and line breaks; generate files in various word processors on PC, Mac, and UNIX systems; include and display equations; and maintain control of the position and compression of images. Finally, University Microfilms International/ProQuest, which publishes theses and dissertations, requires PDF. 2. Spreadsheets for the GTUFS upload There are essentially two types of files that can be delivered by the Auburn University Office of Information Technology through a data download from the central mainframe database ? flat files or spreadsheets. A flat file is a data file that contains records with no structured relationships. The advantage of this type of file is the file size can be relatively small and therefore the data download speed can be fast. The disadvantage is that it does not offer inherent information about the data and interpretation requires more knowledge. A spreadsheet, on the hand, still maintains the structured relationships of the records. Its file size can be substantially bigger than a flat file, but the data type is maintained and the data representation in rows and columns is close to the ones in its source database (DB2) and its destination database (MS SQL Server). The data sorting that is required before uploading into GTUFS can also be easily implemented in spreadsheets. 44 4.3 Data validation To prevent the database from becoming cluttered with unused data and to avoid improper functioning of the Web application, value validation of the data in input controls is needed. 4.3.1 Types of data validation ? client side vs. server side The main difference between client side validation and server side validation is not where the validation is occurring, but rather is in the purpose behind the validation. 1. Client side validation Client side validation is a convenience. It is performed to provide the user with quick feedback, making the application appear responsive, and giving the illusion of a desktop application. However, some advanced users could disable JavaScript on the client and use another ?browser-like? program to send direct commands to the server, which might insert corrupted or incorrect data into the database and make the application behave improperly. 2. Server side validation Server side validation, on the other hand, provides real application level security. It is done to ensure that all data sent to the server from the client is valid data, no matter how the data was entered on the client side. 45 4.3.2 What the ASP.NET data validation controls do The ASP.NET input validation controls are not only done on the server side. The client-side script can also be generated automatically with the range of validation controls that are included. They cover almost all the common validation scenarios, and there is even a custom validation control that can integrate specific non-standard requirements into the overall process [7]. 1. Make required fields mandatory An electronic thesis or dissertation without a title is not useful. In ASP.NET, the RequireFieldValidator Web control is used for a situation such as this. With the RequireFieldValidator Web control, if a student clicks the submit button before entering all required fields, the RequireFieldValidator Web control springs into action, prevents the postback from occurring, and displays an error message. To indicate a field is required, a standard solution such as displaying an asterisk (*) next to the field is applied. 2. Accept data within a range If a student claims his ETD has 0 or 200,000 pages, then it is fair to assume that the student has made a mistake. In ASP.NET, the RangeValidator Web control is used for a situation such as this. It has a Minimum Value property, a Maximum Value property, and a Type property (String, Integer, Double, Date, or Currency) to specify the lower and upper bounds of acceptable values for a specific type. 46 3. Regular Expressions Regular Expressions are a way to describe a string pattern, such as e-mail addresses, phone numbers, Social Security Numbers, or ZIP codes. For example, there are many ways to type an e-mail address incorrectly. Regular Expressions are the most efficient way to make sure all bases are covered. In .NET, one way of validating e-mail addresses with Regular Expressions is to use RegularExpressionValidator control. In its ValidationExpression property, the Web developers can choose e-mail validation from a list of the most common Regular Expressions. 4. Field comparison In some fields it is necessary to check if the value in the validated control matches the value in another control or a specific value. The CompareValidator Web control is available in .NET for this purpose. This control has a ControlToValidate property that specifies the primary validation field on the Web form, a ControlToCompare property that specifies the second field to which the primary one can be compared, a ValueToCompare property to compare the control to a value, and an Operator property to specify the kind of comparison needed between two controls. 47 CHAPTER 5 STORING THE DATA Collected data has little value unless it is stored in a fashion that will relate key data elements to all the stored information according to a predetermined set of retrieval rules. There are two major types of databases that can meet this requirement: flat-file and relational. Flat-file databases are limited to small amounts of data, but allow for simple data manipulation and permit easy human readability. Relational databases are based on a more logical structure and permit the use of multiple linked tables providing sophisticated access to larger amounts of data. Built-in functions expand the power and usefulness of the structure query language (SQL) to offer more efficient data insertion, manipulation, and retrieval. Relational databases were therefore chosen for AUETD and GTUFS because of the nature and the amount of the data collected and used in these two Web applications. 48 5.1 Storing data with ADO.NET ? the interface between the programming languages of Web applications and databases ADO.NET (ActiveX Data Objects) is the primary data access model for ASP.NET applications. From Web forms, ADO.NET, with a consistent set of objects, properties, and methods, offers standardized access and modification of data stored in OLEDB 8 or SQL databases. There are three major components in ADO.NET (see Fig. 5.1). Fig. 5.1 The three major components in ADO.NET 1. The physical data store can be a SQL database or an OLEDB. 2. The .NET data provider creates an in-memory representation of the data (a data set) from the physical data store. 3. The Dataset object is the in-memory representation of the tables and relationships on which the program code works. The .NET data provider furnishes abstraction between the physical data store and the data set. There are two data providers in .NET framework ? SQLClient and OleDbClient. SQLClient works with Microsoft SQL Server databases. OleDbClient 8 OLEDB (Object Linking Embedding Database) is used to implement Microsoft?s strategy of Universal Data Access (UDA) to access any type of data from any application (text files, spreadsheets, email, and such) from any storage device. See http://www.siue.edu/~dbock/cmis460/3-databaseconcepts.htm . Physical data store .NET data provider DataSet 49 works for other database formats such as Oracle, Microsoft Access, or Microsoft Excel. The GTUFS database update process can be used as an example to illustrate both types of database connection in ADO.NET. 1. OleDbClient Fig. 5.2 The OleDbClient example in the GTUFS database update process As described earlier in Chapter 2, ?AUETD and GTUFS,? GTUFS imports a spreadsheet that is a download of tuition fellowship information from OASIS and HRS. After the spreadsheet has been uploaded to the FrontPage server where the GTUFS program resides, an OleDbConnection object is used to connect to the spreadsheet. OleDb database connections use the OleDbDataAdapter object to perform commands and return data. A dataset is created and filled with data by the OleDbDataAdapter. An example of the source code in GTUFS is as follows: Dim conn As New OleDbConnection(myConnectionString) Dim adapter As OleDbDataAdapter = New OleDbDataAdapter Dim myDataSet As New DataSet conn.Open() adapter.SelectCommand = New OleDbCommand(myCmdToDatabase, conn) adapter.Fill(myDataSet, myDataSetName) MS Excel spreadsheet OleDbConnection DataSet 50 2. SQLClient Fig. 5.3 The SQLClient example in the GTUFS database update process Data from the dataset that is filled with fellowship information will be processed and then inserted to an MS SQL database through a SQLConnection. SQL database connections use the SQLDbDataAdapter object to perform commands and return data. The source code in GTUFS is as follows: Dim strconn As String = System.Configuration.ConfigurationSettings.AppSettings ("mySQLConn") Dim objConn As New SqlConnection(strconn) objConn.Open() Dim myCommand As New SqlCommand(myExecuteQuery, objConn) myCommand.Connection = objConn Try myCommand.ExecuteNonQuery() Catch ex As Exception Response.Write(Convert.ToString(ex)) End Try objConn.Close() SqlConnection DataSet MS SQL database 51 5.2 Database systems MS SQL Server includes a relational database. Data is processed according to the relationships between the individual data elements, by means of tables, without a requirement for special orientation rules or processes. This is best explained by examining the AUETD and GTUFS database structures. 5.2.1 Database structure in AUETD 1. Main database ? AUETD There are two tables in the AUETD database: ETD_TAB table and ADMINISTRATOR table. a. ETD_TAB table: The ETD_TAB table contains all the uploaded data from students and the administrative decisions on the student?s work. The uploaded data includes the student?s name, Auburn University e-mail address, personal e-mail address, department, major professor, committee members, graduation date, the type of work, the title of the work, the abstract of the work, the url of the uploaded file (the physical location of the file on the FrontPage server), the file name created by AUETD, the upload date, and the publication restriction set by the author. Examples of administrative decisions are the administrative approval status, the administrative comment, the responsible administrator?s user ID, and the active status of the record. b. ADMINISTRATORS table: ADMINISTRATORS table contains the list of AUETD administrators, including their Auburn global IDs and names. 52 2. Supporting database ? SCTPROD SCTPROD is updated and maintained by the Auburn University Office of Information Technology (OIT) weekly by a batch process. Data is pulled from the Student Information System and the Human Resource System, insuring accuracy and currency. The only table in SCTPROD that is used in AUETD is the GID table. AUETD takes the login user ID to match the user ID in the GID table. When the user ID is found in the table and the system confirms that the user is a graduate student by checking his or her student class in the table, the student?s last name and first name will be exported to the AUETD table and the student will be authorized to upload their research work to AUETD. 5.2.2 Database structure in GTUFS 1. Main database ? GTUFS Tuition fellowship information is collected and stored in the Auburn University DB2 mainframe database system through the On-line Auburn Student Information System (OASIS). After downloading the information to a spreadsheet, the GTUFS administrator imports the data in the spreadsheet to the GTUFS database through the GTUFS Web application (the data import process is described in Chapter 4, ?Collecting the data?). a. TERM_WAIVER table: TERM_WAIVER table preserves the data of the imported spreadsheet associated with the student?s fellowship for each term, and adds 53 new translated data to make the information understandable upon retrieval. In other words, each record row represents the fellowship information of each student for each term. The ?term-associated? data includes the term, the department that offers the fellowship, student?s degree, the type and the amount of the fellowship granted to the student, and the student?s attempted hours for the term. All data in the TERM_WAIVER table is refreshed each time a new data import is completed. b. CAREER_INFOR table: The CAREER_INFOR table contains the data that is associated with each student. Each row of the record represents one student. The ?student-associated? record includes the student?s basic information and the summary of his or her fellowship information. The student?s basic information comprises the student?s Auburn user ID, name, gender, visa type, Auburn e-mail address, and overall GPA. Examples of the student?s fellowship information are cumulative attempted hours, the first term the student received any type of tuition fellowship, the total number of semesters that the student has received tuition fellowships, and the total dollar amount of the fellowships that the student has received. The summary of the student?s fellowships information is the result of the data conversion, calculation, and translation done during the data import process. As in the TERM_WAIVER table, the entire data in the CAREER_INFOR table is refreshed every time a new data import is complete. c. EXTENDED_CAREER table: The EXTENDED_CAREER table records the administrative decisions made on each student. Each row represents the decisions mad on one student. Administrative decisions include the student?s administrative allowable hours or semesters, semester that the tuition fellowship eligibility expired, academic suspension status, nomination status updated by colleges, 54 and administrative comments. Data in the EXTENDED_CAREER table is permanent unless an administrative decision modifies it. d. NOMINATE_GRF table: The NOMINATE_GRF table records the GRF nomination status for students who are new to GTUFS (who do not have an entry in the EXTENDED_CAREER table) or for students who have an entry in GTUFS but are nominated from a college different from their enrolled college. Recorded information includes the nominated student?s ID, first name, last name, department, college, and, if the student has an entry in GTUFS, nomination college, and nomination status. The NOMINATE_GRF table is refreshed by the GTUFS administrators before a new nomination process starts. e. COLLEGE table: College access to GTUFS is controlled by the COLLEGE table. The table consists of college codes, college names, deans? Auburn user IDs, and authorized college staff?s Auburn user IDs (up to five staff for each college). f. DEPARTMENT table: Department access to GTUFS is controlled by the DEPARTMENT table. The table consists of department codes, department names, college codes to which they belong (in order to link with the COLLEGE table), department heads? Auburn user IDs, and authorized departmental staff?s Auburn user IDs (up to five staff for each department). g. ADMIN table: Graduate School access to GTUFS is controlled by the ADMIN table. It includes administrators? Auburn user IDs, names, and access levels. The higher the access level possessed the more advanced functions that the administrator can use to administer GTUFS. 55 h. DEFAULT_MAX_HRS table: The DEFAULT_MAX_HRS table records the maximum hours allowed for each degree. i. SYSTEM_INFOR table: The SYSTEM_INFOR table keeps track of information relating to the update of the student?s GTUFS record. 2. Supporting database ? SCTPROD As in AUETD, GTUFS uses the GID table to identify if the login user for student access is a graduate student at Auburn University. 5.3 Selected issues in storing the data 1. Naming scheme for uploaded files To make the data searching and retrieving easier, Web applications that store and record electronic files usually use a systematic procedure for creating and assigning file names, especially when their databases are large. The naming procedures vary depending on the purpose of the Web applications. The basic principles for a standard naming scheme are: simple but meaningful, not too lengthy, uniform with all other similar files, and unique in the database. The file names should be descriptive and understandable for users and system administrators. They should be composed in a fashion that would most represent the essence of the file content and the purpose of the applications. In AUETD, for instance, the author?s identity is most important to the file management and archiving of theses and dissertations. Therefore all files in AUETD are automatically renamed to the format 56 LastName_FirstName_RandomNumber after the file uploads are complete. GTUFS, on the other hand, is a function-driven application, as opposed to a user-driven application such as AUETD. The files are created by the application when certain functions are employed by the system administrator. Therefore the naming format in GTUFS is FunctionName_AdministerDate, which illustrates the purpose of the files. The use of random numbers in AUETD and administering dates in GTUFS make the file unique in the database. 2. FPDB folder on the FrontPage server and its security All files uploaded by users through AUETD or GTUFS are stored in the FPDB folder on the Auburn University FrontPage server where the AUETD and GTUFS programs reside, while the associated file information such as the author and the file name are recorded on the SQL database. With the FrontPage Server Extensions installed on the Web server, the FPDB folder is marked to be not browseable, scriptable, or executable. Only users with administrative or authoring privileges will be able to access it. Microsoft Access databases can be placed in the FPDB folder. However, FrontPage does not provide any database security beyond the security settings that exist in the database. The files and databases in FPDB folder are backed up by Auburn University OIT on a daily and weekly basis. 57 3. MS SQL Database backup and its security To prevent the theft or corruption of data, the databases need to be secured. There are many techniques to enforce security, such as by restricting users? access to sensitive data by means of view, as described in Chapter 3 ?Login and Security,? or by specifically creating users, logins, and roles that explicitly state the level of access to which each is entitled. [8] 1. Users: A user is someone who can use a specific database. A user with access to one database is not necessary a user with access to another database. 2. Logins: A login will be assigned permissions as a user ID, or a role, if it is associated with a user in a database. It will be assigned to a special user ID called ?guest? if it is not associated with a user. 3. Roles: A role is a collection of logins that require the same level of access to a specific database. A user or a login can be associated with any roles or no role. If the user?s responsibility changes, moving the user?s role from one to another would immediately give the user the correct access for the new role. The databases on Auburn University MS SQL server are backed up by Auburn University OIT on a daily and weekly basis. 58 CHAPTER 6 DISTRIBUTING THE DATA The Internet is known as the Information Superhighway. There is simply no easier way to transfer information than a well-designed Web site. For thousands of years the producers or possessors of information have struggled to distribute their knowledge in a timely and accurate manner. Clay tablets, books, newspapers, radio, and television have all provided partial answers to this problem. Web applications are the most recent and most efficient means for such distribution. 6.1 Distributing the various types of data through Web applications The way the application is designed to distribute the data depends on the nature of the information involved. In general, there are two different types of information distributed via Web applications ? non-restricted information and restricted information. 6.1.1 Non-restricted information ? public information Public information can be freely accessed without any user or resource identification checks. Depending on the amount of the information needed to be displayed, the interface should be designed differently. Information can be accessed through hyperlinks in the content part of the Web page, side or top bars, quick links, or 59 site maps. A more advanced technology like a search engine will be needed if there are massive quantities of information. AUETD, for instance, has an integrated search engine specifically to assist in disseminating the research done by graduate students at Auburn University. This search engine is described in section 6.2. 6.1.2 Restricted information ? private information or public information with restricted access Selecting an appropriate authentication and authorization strategy is one of the most important tasks in the application development in securing restricted information. Authentication is the process of discovering and verifying the identity of a user by examining the user?s credentials and then validating those credentials against some authority. Authorization occurs after authentication and uses information about a user?s identity and roles to determine which resources that user can access. Both AUETD and GTUFS use Windows domain login and Auburn Active Directory to authenticate, and then check access rights through the Microsoft SQL Server database and web.config files to authorize. This has been discussed in Chapter 3, ?Security and Login?. There are two kinds of authorization. One is ?role-based authorization,? in which access to resources is specified by associated user roles. Another is ?resource-based authorization,? in which an application grants access by checking a user?s IP address, the server date, or other imposed resource restrictions. [9] 60 1. Role-based authorization for private information Private and confidential information should be secured and restricted to access only by authorized users. GTUFS uses a ?role-based authorization? scheme to prevent unauthorized users from seeing student tuition fellowship information. Determining the user?s role by checking his or her identity against database records, students can only access their own information, department staff can only access information for students from their department, college staff can only access information for students from their college, and the Graduate School staff can access the fellowship information of all students. 2. Resource-based authorization for public information with restricted access In AUETD, although the published theses and dissertations are meant to be distributed to the public, students can select to have access to their work restricted for a specific period of time or to a specific IP domain. To prevent the restricted ETD from being accessed for a specific period of time, AUETD checks the server time when a user requests to view the ETD, and then compares it to the time restriction set by the ETD author. If the time of request is earlier than the restriction time, the user receives a message denying access and informing the user when the ETD will be available. To prevent the restricted ETD from being accessed from outside the Auburn domain, AUETD checks the client IP address when the user requests to view the ETD, and then compares it to the Auburn University IP domain 131.204. If the user is from 61 outside the Auburn domain, he or she receives a message denying access and information about when the ETD will be available outside the Auburn domain. 6.2 Search engine in AUETD ETDs are designed to be disseminated. The quality of data retrieval depends on the programming of the search functions, the interface presented to the users, and how the information is stored in the database. There are two major search functions in AUETD: one for general users and one for AUETD administrators. 6.2.1 Search function for general users Theses and dissertations should be searchable by keyword, author, title, or year of publication in the same fashion as searching for books or journals in a library. Because of the critical role of the major professor and the importance of knowing the author?s discipline, AUETD is also searchable by major professor and department. The search options can be categorized into three groups based on the program technique used for each: search by KEYWORD; search by AUTHOR, MAJOR PROFESSOR, or TITLE; and search by DEPARTMENT or YEAR. 6.2.1.1 Search by KEYWORD When the keyword search is triggered, AUETD searches through the abstracts or summaries that the authors have supplied. The abstract or summary field is mandatory in AUETD submission. 62 1. Search queries In the code, three SQL queries are used to produce keyword search results: LIKE EXACT MATCH, LIKE AND MATCH, and LIKE OR MATCH. For example, if the user searches for keywords ?word1 word2 word3,? the queries are as follows. Query 1 (LIKE EXACT MATCH): The condition in this query will be [where keywords like '%word1 word2 word3%']. This query generates results containing the exact phrase ?word1 word2 word3? in the abstracts or summaries. Query 2 (LIKE AND MATCH): The condition in this query will be [where keywords like '%word1%' AND keywords like '%word2%' AND keywords like '%word3%' and (keywords not in (select keywords from etd_tab where keywords like '%word1 word2 word3%'))] that generates results containing all three words (?word1,? ?word2,? and ?word3?) in the abstracts or summaries. The later part of the query (keywords not in (select keywords from etd_tab where keywords like '%word1 word2 word3%')) excludes the results with the exact phrase ?word1 word2 word3? in its abstracts or summaries. This prevents any overlap with the results from Query 1. Query 3 (LIKE OR MATCH): The condition in this query will be [where keywords like '%word1%' OR keywords like '%word2%' OR keywords like '%word3%' and (keywords not in (select keywords from etd_tab where keywords like '%word1%' AND keywords like '%word2%' AND keywords like '%word3%'))] that generates results containing ?word1,? ?word2,? or ?word3? in the abstracts or summaries. The later part of the query (keywords not in (select keywords from etd_tab where keywords like 63 '%word1%' AND keywords like '%word2%' AND keywords like '%word3%')) excludes the results with terms ?word1,? ?word2?? and ?word3? in its abstracts or summaries. This prevents any overlap with the results from Query 2. 2. Search results The results of these three queries are the same as the general LIKE OR MATCH [where keywords like '%word1%' OR keywords like '%word2%' OR keywords like '%word3%']. The purpose of dividing this general query into the three queries above is to have the results displayed in an order most logical to users ? displaying the results with THE EXACT PHRASE first, displaying the results with ALL of the words second, and finally displaying the results with ANY of the words. 3. Stopwords 9 Stopwords interfere with normal searches because of their near universality. Because these words appear in nearly every possible database entry, the inclusion of words such as A, AN, or THE in a search string is counterproductive, and returns unnecessary and unusable results. The code for employing the stopwords checking function is as follows. 9 The list of stopwords used in AUETD was provided by Dr. Juan Gilbert of Auburn University?s Department of Computer Science and Software Engineering. 64 Public Function is_stopword(ByVal str As String) As Boolean Dim is_sw As Boolean = False Dim i As Integer = 0 Dim sw() As String = {"a", "about", "above", "according", "accordingly", "across", "after", "afterward", "afterwards", "again", "against", "all", "almost", "alone", "along", "already", "also", "although", "always", "among", "amongst", "an", "and", "another", "any", "anyhow", "anyone", "anything", "anywhere", "are", "around", "as", "at", "be", "became", "because", "become", "becomes", "becoming", "been", "before", "beforehand", "began", "behind", "being", "below", "beside", "besides", "between", "beyond", "both", "but", "by", "can", "cannot", "certain", "could", "define", "did", "do", "does", "down", "during", "each", "eg", "either", "else", "elsewhere", "enough", "especially", "etc", "even", "ever", "every", "everyone", "everything", "everywhere", "example", "except", "explain", "few", "fewer", "finally", "find", "following", "for", "former", "formerly", "from", "further", "furthermore", "generally", "given", "had", "has", "have", "having", "he", "hence", "henceforth", "her", "here", "hereafter", "hereby", "herein", "hereupon", "hers", "herself", "him", "himself", "his", "how", "however", "ie", "if", "in", "include", "included", "includes", "including", "indeed", "instead", "into", "is", "it", "its", "itself", "later", "latterly", "least", "less", "many", "may", "maybe", "me", "meanwhile", "might", "miss", "more", "moreover", "most", "mostly", "much", "must", "my", "myself", "namely", "near", "nearly", "neither", "never", "nevertheless", "next", "no", "nobody", "none", "nonetheless", "nor", "not", "nothing", "now", "nowhere", "of", "off", "often", "on", "once", "one", "only", "onto", "or", "other", "others", "otherwise", "our", "ours", "ourselves", "out", "over", "overall", "own", "part", "particularly", "parts", "per", "perhaps", "probably", "rather", "s", "same", "seem", "seemed", "seeming", "seemingly", "seems", "set", "several", "she", "should", "similar", "since", "so", "some", "somehow", "someone", "something", "sometime", "sometimes", "somewhat", "somewhere", "still", "such", "tell", "than", "that", "the", "their", "them", "themselves", "then", "thence", "thenceforth", "there", "thereafter", "thereby", "therefore", "therein", "thereupon", "these", "they", "this", "those", "though", "through", "throughout", "thru", "thus", "to", "together", "too", "took", "toward", "towards", "under", "unless", "unlike", "unlikely", "up", "upon", "us", "use", "used", "using", "usually", "various", "very", "via", "want", "was", "way", "we", "well", "were", "what", "whatever", "when", "whence", "whenever", "where", "whereafter", "whereas", "whereby", "wherein", "whereupon", "wherever", "whether", "which", "while", "whither", "who", "whoever", "whole", "whom", "whomever", "whose", "why", "will", "with", "within", "without", "would", "yes", "yet", "you", "your", "yours", "yourself", "yourselves"} While i <= UBound(sw) And is_sw = False If str = sw(i) Then is_sw = True End If i = i + 1 End While is_stopword = is_sw End Function In the code, each keyword entered by the users will be checked against the stopwords list called ?sw().? If the entered keyword is one of the stopwords, it will be added to the ?stopwords string? that will be displayed to the user. An example is shown in Fig. 6.1, where the user has searched for ?the auetd? as the keywords. Because ?the? is defined as a stopword in AUETD, it is removed from the search criteria and displayed 65 as an error along with the results of the search for ?auetd? in this example. Users will then learn what words are not searchable. This technique is commonly used in search engines such as Google. Fig. 6.1 Stopwords example 6.2.1.2 Search by AUTHOR, MAJOR PROFESSOR, TITLE The techniques used in designing and implementing the search by AUTHOR, MAJOR PROFESSOR, and TITLE are very similar to those in the keyword search. The EXACT MATCH, LIKE AND MATCH, and LIKE OR MATCH queries are also used to generate the results. The formatting of names and titles is not important. For example, the name format stored in the database is ?last-name, first-name.? If users search for ?first-name last-name? in conventional name format, the same result as searching for ?last-name, first_name? will be attained. Users can also enter only the last name or the 66 first name because LIKE OR MATCH will generate anything matching either the first name or the last name. For example, when searching for "John Smith," entering "John Smith," "Smith, John," "Smith," or "John" will all produce good results. When searching by MAJOR PROFESSOR, if the user adds ?Dr.? before a professor?s name, it will not affect the searching since AUETD still searches the rest of the entry. Exact thesis or dissertation TITLE names are preferred but not required. For example, searching for "Discrete Analysis of Numerical-Based Topographic Openings," "Discrete Analysis," or "Numerical-Based Analysis" will produce similar results. The more precise the query submitted by the user, however, the better the results. Stopwords screening is not used in the AUTHOR, MAJOR PROFESSOR, TITLE section. Users should be able to search for the author name ?You Me,? even though ?You? and ?Me? are both considered stopwords in a keyword search. Titles can also logically contain stopwords. Therefore users should be able to search for titles with any words in them as well. 6.2.1.3 Search by DEPARTMENT or YEAR A search by DEPARTMENT will generate a list of every ETD produced within one of Auburn University?s academic departments. Offering the department list for selection provides ease of use and improves accuracy and efficiency. This search offers special convenience to non-Auburn University users who are unfamiliar with the university?s organizational structure. The department list is stored in the database and kept up-to-date by AUETD administrators. 67 A search by PUBLICATION YEAR will generate a list of years for which there are records in the database. The AUETD developer does not need to update the list of available years because it is generated by the code behind the interface based on the server date. 6.2.1.4 Search results (Fig. 6.2) 1. Results displayed order Search results are displayed in the following order: (1) Results with THE EXACT PHRASE (2) Results with ALL of the words (3) Results with ANY of the words AUETD prioritizes the results this way so that the closest results will always be at the top of the list, before the less expected results. If the first result does not meet expectations, users can assume the remaining results are probably not what they want. Users can then decide to rephrase their search entry or even change to another search option to obtain better results. 2. Refining search results There are two ways for users to refine search results. (1) Restricting the search terms. The more precise the query, the better the chance to find the desired results. For example, if searching for major professor ?Stephen McFarland,? entering only ?McFarland? will produce the broadest results of any 68 professor names containing ?McFarland.? On the other hand, entering ?Stephen McFarland? generates professor names with both ?Stephen? and ?McFarland? listed before those with only ?Stephen? or only ?McFarland.? (2) Searching within the results of a previous search. For example, a user wants to search for theses or dissertations published in the year 2000, in the Physics Department, and related to ?solid state physics.? The user can search for the YEAR ?2000? first and select the ?Search within result? option as shown in Fig. 6.2. The user then searches for the DEPARTMENT ?Physics? followed by another ?Search within result? for the KEYWORD ?solid state physics.? The ?Search within result? mechanism remains active once selected. Therefore search results will be continuously refined until the user selects the ?New Search? option and initiates a different search. 3. Number of items displayed with each search Users have the option to choose the number of results to be displayed on a page. The default number is five. For common screen resolutions, five are the most that can be viewed without moving the scroll bar. 4. Sorting the list By selecting the header on a list of search results, users can resort the results by title, ETD type, department, number of pages in the ETD, file size, author, or date of submission. The default order displays the most recently submitted ETD first. Selecting the same header a second time rearranges the results in the opposite order. 69 5. Status bar A status bar is displayed at the top of each search result list to remind users of their current search parameters. This feature is especially helpful in situations where ?Search within Result? is used several times. If stopwords are entered when searching by keyword, they will be appended to the status bar so that users know the words are not searchable in AUETD. Fig. 6.2 The search result page of AUETD 6.2.1.5 Interface design considerations for the general search function Many ETD sites choose to use relatively complex search interfaces with Boolean options required for the users. Such an interface, as shown in Fig. 6.3, creates the following disadvantages. 70 1. Users feel overwhelmed and do not know where to begin, because there are too many drop-down menus and textboxes for one search. 2. Multiple Boolean options confuse general users. For example, ?condition 1 AND condition 2? is easily comprehended, but ?condition 1 AND condition 2 OR condition 3 AND condition 4 OR condition 5? is confusing. 3. If users search for ETDs based on many criteria at one time and are not satisfied with the results, it would be difficult for them to identify which criteria caused the problem. 4. A keyword search and a main search (for example, Author, Department, or Title) performed separately would produce less refined results. For example, users can not search for theses or dissertations that were published in the year 2000 in the Physics Department related to ?solid state physics.? This interface separates the keyword search from the department search and from the year search. Fig. 6.3 The ETD search page of Chung Yuan Christian University (http://thesis.lib.cycu.edu.tw/ETD-db/ETD-search/search) 71 Compared to the above interface, AUETD is designed to be easily understood by users (see Fig. 6.4 and Fig.6.5). The general design considerations for the AUETD search function are as follows. 1. Because the search function is truly essential to electronic thesis and dissertation dissemination, it holds a prominent position on the first page of AUETD. In addition, there is a ?Search AUETD? button on the banner of every page of AUETD. At any point, users, whether they are general users, authorized users, or administrators, may search the entire database. 2. The link to ?List ALL Theses, Dissertations, or Projects in AUETD? is available for users interested in viewing the entire collection of AUETD. By default, the results are sorted by date of submission ? the most recent work displays first. Users can choose to sort the list by title, type of work, department, number of pages, file size, or author. 3. The search function is displayed as a single horizontal bar with three major fields. i. The first major field is a drop-down menu that allows users to select a search by title, author, keyword, department, major professor, or year. ii. The second major field is linked dynamically to the first major field. If title, author, keyword, or major professor is selected in the first field, AUETD generates a text box for query entry. If department or year is selected in the first field, AUETD generates a drop-down menu for query selection. 72 iii. The third major field is a ?Go!? button that executes the search function and displays the results. 4. AUETD searches are not case sensitive. AUETD converts all capital letters to lower case. Therefore, searches for "John Smith", "JOHN SMITH", and "john smith" will return the same results. 5. The ?Search Help? page is available for users with questions regarding the search function. 6. The required field validation is implemented in AUETD. If a user hits the ?Go!? button without first entering data or selecting an option, the search mechanism will not be triggered and an error message will be displayed. Fig. 6.4 The first page of AUETD 73 Fig. 6.5 The search page of AUETD 6.2.2 Search function for AUETD administrators AUETD administrators needing to edit a specific ETD record or a group of similar ETD records use a different search mechanism. 6.2.2.1 Search criteria In the administrator search function, if the User ID of the ETD author is known, administrators can immediately locate all records associated to that author because the User ID is unique to every author. One user ID may have authored multiple ETDs. If the User ID is not available for search input, administrators can locate the records associated with the author by searching for any combination of first name, last name, type of work, department, and publishing year. All search criteria entered by administrators are 74 considered part of the search condition, which is an AND search in query language. Administrators needing to locate a group of similar records (for example, all theses from the History Department published in 2002) will search by department, type of work, and submission year. (Fig. 6.6) Fig. 6.6 The administrator search page of AUETD 6.2.2.2 Search results An administrator search displays a list containing student names, types of work, departments, approval statuses, and submission dates (see Fig. 6.7). Administrators can select the target record from the list to edit information, approve or disapprove the ETD, replace the existing ETD file with another, or delete the whole record (see Fig. 6.8). 75 Fig. 6.7 The administrator search result page Fig. 6.8 The administrator edit/approve page 76 6.2.2.3 Interface design considerations for the administrator search function As with the general search function, the sorting capability and the query status bar are made available. Required data validation is not implemented here. If a search parameter is not entered into a query field, that field will be ignored in the search condition. 6.2.3 Conclusions about the AUETD search engine The key to the efficient and effective distribution of online knowledge is a good search utility. A solid search engine not only benefits the user browsing for information, but also supports application administrators in processing information. The design objective of the AUETD search function was to present a clean, easy to use, simple to learn, and fast to operate interface. AUETD satisfies this objective by offering multiple search parameters, avoiding unrelated search results, allowing free form query entry without formatting restrictions, providing a simple resort capability for search results, and supplying a straightforward method for refining search queries. These qualities allow AUETD to achieve the goals of encouraging the public to use AUETD while promulgating Auburn?s graduate research. 77 CHAPTER 7 CONCLUSION For collecting, storing, and distributing data, dynamic database-driven Web applications offer a convenience and efficiency never before seen. Providing users an up- to-date view of information stored in databases, such applications make this information available to users on demand, with all the advantages and conveniences offered by the Internet. ASP.NET is the platform chosen in this study to create Web applications and to promulgate graduate research and disseminate student information. ASP.NET is not the only means for achieving these objectives. Other technologies, notably the CGI, are also widely used for development of dynamic Web services. What makes ASP.NET favorable here are the advanced features it offers and its integration with Microsoft server, data access, and security tools. Microsoft SQL Server was chosen for this study for its smooth incorporation with ASP.NET. Authentication and authorization are important elements for developing secure sites. Among the authentication providers supported by ASP.NET, Windows authentication provider was discussed in this study because Auburn University offers the Windows Domain login mechanism utilizing Microsoft?s Active Directory Services. The Windows Domain mechanism delivers a convenient and secure system to 78 authenticate users with support and maintenance from the Auburn University Office of Information Technology. Configuration files and user information in MS SQL Server are used to authorize authenticated users. Collecting, storing, and distributing data are the three main tasks associated with the process of promulgating and disseminating information through Web applications. This study examines the problems of and solutions for issues related to these tasks, including means of collecting data, data validation schemes, data handling, database management, data access security, search engines, flexible user interface, and client browser capabilities. The Auburn University Electronic Thesis and Dissertation (AUETD) and Graduate TUition Fellowships System (GTUFS) are used as examples to illustrate this process. 79 REFERENCES [1] T Modi. (2004). Client Side Data Validation: A False Sense of Security. http://www.javaworld.com/weblogs/javadesign/archives/000187.html . Java World. [2] R. Birdwell, O. Cornes, C. Goode, A. Krishnamorrthy, J. T. Llibre, C. L. Miller, N. Raybould, D. Sussman, C. Ullman. (2001) Beginning ASP.NET using VB.NET. Birmingham, B27 6BH, UK. Wrox Press. [3] S. Worley. (2002). Inside ASP.NET. Indianapolis, IN: New Riders. [4] J. Conard, P. Dengler, B. Francis, J. Glynn, B. Harvey, B. Hollis, R. Ramachandran, J. Schenken, S. Short, C. Ullman. (2000). Introducing .NET. Birmingham, B27 6BH, UK. Wrox Press. [5] D. S. Platt. (2003). Introducing MICROSOFT .NET. Redmond, WA: Microsoft Press. [6] J. Webb. (2002). MCAD/MCSD Self-Paced Training Kit: Developing Web Applications with Microsoft Visual Basic .NET and Microsoft Visual C# .NET. Redmond, WA: Microsoft Press. [7] R. Anderson, B. Francis, A. Homer, R. Howard, D. Sussman, K. Watson. (2001). Professional ASP.NET. Birmingham, B27 6BH, UK. Wrox Press. [8] R. Dewson. (2001). Beginning SQL SERVER 2000 Programming. Birmingham, B27 6BH, UK. Wrox Press. [9] A. A. Yazdani. (2005). ASP.NET Security Models. ASP.NET PRO. 80 BIBLIOGRAPHY [1] Modi, T. (2004). Client Side Data Validation: A False Sense of Securit: http://www.javaworld.com/weblogs/javadesign/archives/000187.html Java World. (2 Aug. 2004) [2] Moxley, J. M. (n.d.). The guide to electronic theses and dissertations: http://etdguide.org/ . (3 Aug. 2004) [3] Birdwell, R., Cornes, O., Goode, C., Krishnamorrthy, A., Llibre, J. T., Miller, C. L., Raybould, N., Sussman, D., Ullman, C. (2001). Beginning ASP.NET using VB.NET: Wrox Press. (7 Oct. 2004) [4] (2003). Networked Digital Library of Theses and Dissertations (NDLTD): http://www.theses.org/ . (4 Nov. 2004) [5] University of Stuttgart (2001). Electronic Papers, Thesis and Dissertations of the University of Stuttgart: http://elib.uni-stuttgart.de/opus/doku/english/index.html . (14 Nov. 2004) [6] Weisser, C., Baker, J., Walker, J. R. (1997). Electronic theses and dissertations: problems and possibilities: http://www.cas.usf.edu/english/walker/papers/etds/bibliog.html . (4 Feb. 2005) [7] University of Texas ? Austin (2003). A digital library of dissertations and theses, hosted by UMI: http://wwwlib.umi.com/cr/utexas/main . (27 Aug. 2005) [8] Worley, S. (2002). Inside ASP.NET: New Riders. (7 Oct. 2005) [9] California Institute of Technology (2003). Electronic theses: http://library.caltech.edu/etd/ . (7 Oct. 2005) [10] Chung Yuan Christian University (n.d.). Electronic theses and dissertations: http://thesis.lib.cycu.edu.tw/ . (7 Oct. 2003) 81 [11] Clough, K. (2003). Internet Search Engines: http://www.searchengineguide.com/searchengines.html . (21 Oct. 2005) [12] Auburn University (n.d). Auburn Windows Domain: http://php.auburn.edu/oit/support/glossary_definition.php?id=148 . (21 Oct. 2005) [13] Conard, J., Dengler, P., Francis, B., Glynn, J., Harvey, B., Hollis, B., Ramachandran, R., Schenken, J., Short, S., Ullman, C. (2000). Introducing .NET: Wrox Press. (25 Oct. 2005) [14] Dewson, R. (2002). SQL Server 2000 Programming: Wrox Press Ltd. (26 Oct. 2005) [15] Platt, D. S. (2003). Introducing MICROSOFT .NET: Microsoft Press. (27 Oct. 2005) [16] Microsoft corporation (2004). ADO.NET: Update a Database from a DataSet: http://quickstart.developerfusion.co.uk/QuickStart/howto/doc/adoplus/UpdateDataFromDB.aspx . (3 Nov. 2005) [17] Webb, J. (2002). MCAD/MCSD Self-Paced Training Kit: Developing Web Applications with Microsoft Visual Basic .NET and Microsoft Visual C# .NET: Microsoft Press. (11 Nov. 2005) [18] Anderson, R., Francis, B., Homer, A., Howard, R., Sussman, D., Watson, K. (2001). Professional ASP.NET: Wrox Press. (11 Nov. 2005) [19] Esposito, D. (2002). Building Web Solutions with ASP.NET and ADO.NET: Microsoft Press. (23 Nov. 2005) [20] Yazdani, A. A. (2005). ASP.NET Security Models: ASP.NET PRO. (4 Dec. 2005)