This work was supported by the University of Tampere and the Finnish Ministry of Education. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material over the Internet. Currently, we provide two fully computer-based courses, “Introduction to Bioinformatics” and “Bioinformatics in Functional Genomics.” Here we will discuss the application of distance learning in bioinformatics training and our experiences gained during the 3 years that we have run the courses, with about 400 students from a number of universities. The courses are available at bioinf.uta.fi.
Bioinformatics is a multidisciplinary field that combines methods and approaches from biology and biomedicine with computer science, mathematics, and statistics. Bioinformatics is used to store, analyze, process, and manipulate diverse information relevant to biological systems. Bioinformatics is an integral part of many modern laboratory experiments. With the recent flood of genome data, the demand for bioinformaticians has risen sharply, both for developing new ways to extract knowledge from data (“bioinformatics developers”) and for performing analyses and data mining for discoveries of scientific and commercial value (“bioinformatics expert users”). The technological advances in large-scale gene and protein expression studies facilitate generation of massive datasets that need bioinformatical analysis. Currently, there is a severe worldwide lack of experts in bioinformatics [1–2].
In addition to a shortage of bioinformaticians, there is also a global shortage of educational resources. Thus far, bioinformatics has been a sideline both in biology and in computer science, and special curricula for bioinformatics have been started only in the last few years [3–5].
Most publicly available bioinformatics data and tools are Internet-based. Many analyses can be performed directly on Internet servers, or the analysis programs are distributed on the Internet. Therefore, distance learning is well suited for training in bioinformatics, because it makes the student do the work exactly as it is done in real life.
We at the Bioinformatics Group of the Institute of Medical Technology, University of Tampere, develop introductory and advanced distance learning courses to form a virtual bioinformatics suite. Our distance learning concept is to provide self-paced courses to be taken anytime, anywhere; all that is needed is Internet access. Courses have been designed for easy learning. A “Net Tutor,” an experienced teacher of bioinformatics, a real person, is available and provides help and guidance whenever needed. Distance learning allows maximal learning for students of various starting levels. We offer individual guidance based on the students' backgrounds and their aims and interests.
COURSES IN VIRTUAL BIOINFORMATICS SUITE
Currently, we provide two courses, “Introduction to Bioinformatics” and more specialized “Bioinformatics in Functional Genomics,” at bioinf.uta.fi. Each course equals 2 CU (credit units in the Finnish university system, or 2 weeks of solid work, 80 h), or 3 ECTS (European Credit Transfer System) credits. In the near future, we will release “Structural Bioinformatics,” another 2-CU module in our suite. After that, the next addition will be “Protein Modeling,” a hands-on course of 3 CU.
The aim of the courses is to give profound knowledge of the key areas of bioinformatics and to teach the theory and use of the current methods. After the course, the students should be able to apply bioinformatics in their own work and research, and to evaluate the results critically. The courses contain validated, concise lists of the most essential data sources, which are freely accessible on our web site. Our courses are available throughout the academic year, without any fixed starting dates. In our experience, a concerted effort is most efficient, so there is a time limit for finishing the course once it is started.
The introductory course includes the background and theory of most essential bioinformatics databases and methods in DNA and protein sequence comparison and analysis, in protein analysis, and in protein three-dimensional structure and visualization. The themes of “Bioinformatics in Functional Genomics” are genome projects and large-scale sequencing projects; sequence variations, including single nucleotide polymorphisms, mutations, and polymorphisms, and their animal models including knockouts; bioinformatics in microarray and proteomics analyses; and interaction, signaling, and metabolism networks.
The courses are free for students of the University of Tampere, whereas others are educated as a paid service. The current fees for the introductory course are 120/600 (for academic students in Finland/others) and 180/900 for “Bioinformatics in Functional Genomics.” In general, it is the department, institute, or research group that pays for a student, and this is a very cost-efficient way to offer these courses, compared with arranging live teachers.
To facilitate the build-up, development, and continuous running of the courses, we have designed and implemented our own learning environment, composed of a Learning Management System (LMS) and a Student Management System (SMS). The guiding principles in system development were to provide a light and inexpensive platform that facilitates versatile distribution of course material and guarantees safe and error-free handling of student data. It was important that the different parts of the system work together smoothly and can be reused in different courses. Overall cost-effectiveness was another key factor.
Because all the course material is presented as HTML pages, there was no need for special e-learning or course-building software. With license payments for commercial courseware, our courses would have been substantially more expensive, and probably would not have attracted nation-wide participation, which has enabled us to keep developing our courses.
THE STUDENT MANAGEMENT SYSTEM (SMS)
A number of routine tasks, such as registration and student follow-up, have been automated by using cgi scripts that perform automatic mailing and data storage upon submission of HTML forms (Fig. 1). Nevertheless, we want to keep ultimate control in human hands, so the courses are not completely automated.
The SMS registry is our central tool to integrate all student data, starting from registration, going through completion of learning goals and assignments, and ending in sending the course certificate. It is an MS Excel file containing seven interlinked tables, each one specializing in some task, like registration data, course progress, statistics, or user ID management. Most tables are equipped with auto filtering, conditional formatting, and other tricks that help in extracting and visualizing relevant parts of all student data quickly (for example, only students currently on the course). Some tables generate ready-made Unix scripts for updating access control and password lists, or ready-made HTML code for the “Teacher's View.” We also have direct hyperlinks from the registries to assignment answers, learning diaries, etc. Separate SMS registries are maintained for each course.
In addition to the master registries, the SMS consists of HTML forms, cgi scripts that process the forms, data files stored by the scripts, and HTML pages generated dynamically by the scripts. Course certificates are printed using mail merge properties of MS Word, with the student data taken directly from the SMS registry.
Some dynamically created web pages are intended for the private use of the students themselves, some for teachers and course staff only. Visibility of the private pages is controlled by user authentication at course login, or in the case of staff pages at staff login when opening the “Teacher's View.” The Unix cron tool for timed execution of tasks is used in our SMS to run the Perl scripts that automatically generate and E-mail the messages containing user IDs and passwords.
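The credential mailing step can be illustrated with a minimal Python sketch. It is not the original implementation, which used Perl scripts run by cron; the user ID scheme, the sender address, the message wording, and all function names are hypothetical. The sketch only composes the message and does not send it.

```python
import secrets
import string
from email.message import EmailMessage

# Characters allowed in generated passwords (an assumption for this sketch).
ALPHABET = string.ascii_letters + string.digits


def make_password(length=10):
    """Return a random password drawn from letters and digits."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def make_user_id(course_code, serial):
    """Hypothetical user ID scheme: course code plus a zero-padded serial."""
    return f"{course_code}{serial:04d}"


def build_credentials_mail(student_email, user_id, password,
                           sender="tutor@example.org"):
    """Compose (but do not send) the message carrying the login details."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = student_email
    msg["Subject"] = "Your course user ID and password"
    msg.set_content(f"User ID: {user_id}\nPassword: {password}\n")
    return msg


if __name__ == "__main__":
    uid = make_user_id("intro", 17)
    print(build_credentials_mail("student@example.org", uid, make_password()))
```

A timed job would call such a function for each newly confirmed student and hand the resulting message to the local mail system.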
The principle in SMS is that no data should be entered more than once. So, when a student enters her application, it is automatically stored in a temporary registry, and notifications are mailed to us automatically. When we receive a confirmation that the student is actually backed up by her department, we can move the data into the appropriate SMS master registry (direct copy and paste). At registration, the student data is simultaneously placed in the appropriate course directory to serve in generating personalized pages. Fig. 1 shows how the SMS works at student registration.
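The "enter once" principle can be sketched in Python as follows. This is an illustration under stated assumptions, not the system described here: JSON files in two directories stand in for the temporary and master registries (the actual master registry is an MS Excel file updated by copy and paste), and the function and field names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def store_application(form, temp_dir):
    """Save a submitted form once, in the temporary registry.

    Returns the stored file path and the text of a staff notification.
    """
    temp_dir = Path(temp_dir)
    temp_dir.mkdir(parents=True, exist_ok=True)
    path = temp_dir / f"{form['email']}.json"
    path.write_text(json.dumps(form), encoding="utf-8")
    notice = f"New application from {form['name']} <{form['email']}>"
    return path, notice


def confirm_application(email, temp_dir, master_dir):
    """Move the stored record to the master registry after confirmation."""
    src = Path(temp_dir) / f"{email}.json"
    master_dir = Path(master_dir)
    master_dir.mkdir(parents=True, exist_ok=True)
    dst = master_dir / src.name
    src.replace(dst)  # the record is moved, never typed in again
    return json.loads(dst.read_text(encoding="utf-8"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        form = {"name": "A. Student", "email": "a@example.org",
                "course": "intro"}
        _, notice = store_application(form, Path(root) / "temp")
        print(notice)
        print(confirm_application("a@example.org",
                                  Path(root) / "temp",
                                  Path(root) / "master"))
```

Because the confirmed record is the same data the student typed into the form, transcription errors cannot be introduced between application and registration.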
We estimate that we currently spend 1–2 hours per week in course-related system management (not counting development) and handling of student data, diplomas, etc., which would give an average of under 1 hour per student. Replying to inquiries and other precourse correspondence takes us another hour per week.
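The per-student figure can be checked with a short calculation. The inputs are a reading of numbers given elsewhere in this article, not a calculation stated by the authors: about 3 years of running the courses, an assumed 52 weeks of system management per year, and the 408 registered students reported under "Experiences and Lessons."

```python
def hours_per_student(hours_per_week, weeks, students):
    """Average administration time per student over the whole period."""
    return hours_per_week * weeks / students


WEEKS = 52 * 3   # assumption: 3 years of continuous running
STUDENTS = 408   # registrations reported later in the article

low = hours_per_student(1, WEEKS, STUDENTS)
high = hours_per_student(2, WEEKS, STUDENTS)
print(f"{low:.2f} to {high:.2f} h per student")  # 0.38 to 0.76 h per student
```

Under these assumptions, both ends of the range stay below 1 hour per student, consistent with the estimate above.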
THE LEARNING MANAGEMENT SYSTEM (LMS)
The LMS facilitates the interactions between the student and the tutor. It is tightly integrated with the SMS to keep track of the course progress.
The central elements in LMS are HTML forms for submitting assignments or free questions or feedback and for maintaining learning diaries, and the cgi scripts that handle the forms. The teachers use hyperlinks from the SMS registry or LMS web pages that give access to the course work of any student and allow giving feedback, which is both written to the student's private pages and E-mailed to the student. In addition, the web server log files document the student's presence in the course. The answers, questions, and various logged events are saved in LMS log files in the course server. Different parts of the LMS data are visible to teachers and students, as shown in Fig. 2, which depicts how the actions of the student are handled by LMS. The automated part in LMS produces HTML pages. The “Teacher's View” allows looking into all data of all students. “My Page” shows the student's own answers, teacher's comments, and personal data. User authentication restricts the access so that no student can see the data of other students.
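The visibility rule for “My Page” and the “Teacher's View” can be sketched as a simple filter. This Python fragment is illustrative only; the record layout, the role names, and the function name are assumptions, and it presumes that the user has already been authenticated at login.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: str
    role: str  # "student" or "teacher" (hypothetical role names)


def visible_records(user, records):
    """Teachers see every record; a student sees only his or her own."""
    if user.role == "teacher":
        return list(records)
    return [r for r in records if r["student_id"] == user.user_id]


if __name__ == "__main__":
    log = [
        {"student_id": "s1", "event": "assignment 1 submitted"},
        {"student_id": "s2", "event": "question sent to tutor"},
    ]
    print(visible_records(User("s1", "student"), log))
    print(visible_records(User("t1", "teacher"), log))
```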
Fig. 3 shows our LMS from the teacher's perspective, illustrating how the registry and web pages help to give suitable feedback to facilitate the student's learning.
In addition to the communication means provided inside the course, the students are encouraged to contact the tutors by E-mail and/or interactive chat as well. Free-form communication is actually the most important part in learning management and support. Automated parts allow the tutor to manage the course routines with minimal effort and spend more time in real interaction and teaching. The data provided by LMS and “Teacher's View” are helpful for choosing an appropriate guidance style for each student.
The course material has been specially designed for distance learning. The pedagogic principles are to provide experiences and to learn by doing, by applying the methods and information sources just like in bioinformatics research. The course material includes interactive reading material and hands-on lessons. There are a number of carefully selected links to on-line databases and tools, quiz-type exercises, and written assignments, all supported by personal on-line tutoring. In the introductory course, the learned skills are tested and consolidated in a personal practical computer project. The “Bioinformatics in Functional Genomics” course allows the choice between doing several mini-essays, solving a real research problem, doing a large number of small hands-on exercises, or some combination of the above that is meaningful for the student's learning goals.
The contents of the courses are flexible enough to fit each student's previous knowledge and personal needs and interests. Learning in depth all the tools and databases that are introduced would be a goal beyond all reasonable limits, so in addition to getting an overview of the whole field, each student can focus their detailed learning on the things they find most useful.
A bioinformatics glossary, which is an essential tool for the students, is a fixed part in all our course pages. In addition, we refer Finnish students to the English-Finnish bioinformatics dictionary.
We do not arrange formal examinations for our courses. The introductory course is completed after the student has submitted an acceptable project report demonstrating the ability to perform analyses using basic methods of bioinformatics and to understand and interpret the results and their limitations. In addition, the course logs and small assignments document that the student has actually worked through all of the course material. In “Bioinformatics in Functional Genomics,” all course work is assembled into an on-line learning diary. The course is completed when the diary is judged to contain enough notes and problem solving by the joint decision of the student and the tutor.
At all times, the students can obtain help and support from the Net Tutor. Students can use E-mail or the feedback/question forms inside the course to contact the Net Tutor to discuss their problems. On-line chat using MSN Messenger can be arranged for individual or small group discussions at a set time or when necessary.
The major function of the Net Tutor is to act as a “coach” for students, to allow them to learn by experimenting, and to provide help, guidance, and support when needed. The contacts between the student and tutor are very frequent. There are on average 15 contacts per student in the introductory course and even more in the functional genomics course. This is certainly a lot more than in a regular classroom course, meaning that every student has to take an active role in learning, which would ordinarily be the case for only the most outspoken students in a classroom situation.
For the teacher, answering most E-mailed questions is very quick, but reading and commenting on project reports and learning diaries is fairly time-consuming. We estimate that we spend about 4 h per student in personal teaching and evaluation.
EXPERIENCES AND LESSONS
Our virtual bioinformatics courses were established in 2001. There is a great need for this kind of course. Although bioinformatics is a specialized subject, we have already had 408 students registered in our courses (363 in “Introduction to Bioinformatics,” and 45 in “Bioinformatics in Functional Genomics”) from 26 countries or nationalities (mostly foreign students in Finnish universities or Finnish students taking our course from abroad, plus some cases with no Finnish connection). Our introductory course is a part of the curriculum in the three largest Finnish universities, those of Tampere (biotechnology), Helsinki (biochemistry), and Turku (biology), and semi-obligatory in several Finnish graduate schools. In addition, “Bioinformatics in Functional Genomics” is in the biotechnology curriculum at the University of Tampere. Both of our courses were part of the special 1-year bioinformatics training course arranged by the Institute for Extension Studies at the University of Tampere in 2002. Besides students who take our courses as modules in their graduate or post-graduate degrees, we have attracted a large number of post-doctoral researchers from all major Finnish universities and even some from abroad.
The courses have been recognized elsewhere, too. The Ministry of Education in Finland awarded the National Quality Prize for Advances in Web Teaching to our Virtual Bioinformatics project in 2002.
Despite the great possibilities of distance learning in bioinformatics, only a few courses are available. The S-Star Bioinformatics Course is based on streaming video lectures along with synchronized slides, discussion forums, and assessments. The course material is freely available on the Internet. To obtain a certificate, students have to participate in lectures and discussions following a fixed schedule. These courses have been arranged once per year. The University of Manchester offers several distance learning courses in bioinformatics, comprising a full MSc program, for rather high fees. These course modules can be taken only at given times, once or twice per year. There are many bioinformatics tutorials on the Internet, but they are mainly introductions, without the possibility of obtaining credits.
Another project aiming at comprehensive, good-quality teaching materials in bioinformatics is called the European Multimedia Bioinformatics Educational Resource. The termination date for this project was in April 2003, and thus far they have provided a prototype web course for protein sequence analysis.
Students need a basic understanding of biochemistry and molecular biology and basic computer skills to take advantage of our courses. Basic university-level biochemistry/molecular biology is sufficient for the biological framework in the introductory course. We provide links to other educational sites for those who need training in protein and/or DNA concepts, but we require students to have a certain background knowledge. Course applicants estimate their level in different areas of biological and information technology (IT) knowledge, and based on that we may suggest that they take further studies before starting our bioinformatics courses. IT knowledge and skills are not formally required in the two courses currently running, except for use of E-mail, Internet, and word processing.
So far, the great majority of our students have come from the biological sciences, including biochemistry. Twenty percent have majored in IT, engineering, mathematics, or statistics, and 3% in medicine or pharmacology.
We have received almost exclusively positive feedback from students. Actually, evaluation of the course and suggestions for improvement are mandatory parts of the course as much as the estimation of the development of one's own skills during the course. Many students get help in their own research when studying the course material, or by making a project related to their own research interest. This impression of ours, gathered from inspection of free-format written feedback, was further confirmed by a survey, which we ran in November 2003.
In our survey, we sent an E-mail request to everyone who had completed our “Introduction to Bioinformatics.” Of the 176 addresses that were still functional, we got 132 web form replies (75%, after deleting obvious duplicate submissions). The ratio of biology versus IT background in the answers was nearly identical to that of all students, and the response percentages were similar for all years of our course (2001 to 2003), so we take the sample to be representative. On a scale of 1 to 5, the mean overall satisfaction score was 4.3. The skills and knowledge learned in our course were rated as very useful in the respondents' work and studies (mean score 4.3). When asked how well it worked as a virtual course, the students scored our course 4.4. Finally, no less than 84% reported that they had recommended this course to others (a yes/no question).
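The response rate quoted above follows directly from the two counts given in the text, as this short Python check shows. The variable names are illustrative.

```python
functional_addresses = 176  # E-mail addresses that still worked
replies = 132               # web form replies after removing duplicates

response_rate = replies / functional_addresses
print(f"{response_rate:.0%}")  # 75%
```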
Identification of students is an important issue in distance learning courses. We rely on contacting the students at their personal E-mail addresses (Figs. 1 and 3), and we watch for an individual and consistent style in their work. Because institutes and research groups are paying the fees, they also share our concern. We have also established guidelines to deal with malpractice, although thus far only one student has tried to use unethical methods to pass the course. Still, we cannot exclude the theoretical possibility of having a paid stand-in do the work, and in the end it is up to every institute how they want to verify their students before transferring our course certificate into formal academic credits. In the case of students from our own university, we assume that they would be willing to take a viva voce examination on the content of their course project at any time if doubts arise, and we are ready to co-operate if another institute wants to investigate whether their students did the work themselves. However, we do not see cheating as a threat to the credibility of our courses, since most students take these courses for a personal need of bioinformatics skills. In addition, our courses are optional modules in most curricula, so the students seem well motivated, coming to our courses for the content, not for the credit.
Our courses were specially designed for distance learning. Web-distributed lecture notes or streamed videos of normal lectures do not make optimal material for distance learning. There are several commercial distance learning platforms available, but none of them have all the special properties that we needed. Although we apply a number of web techniques, the technology is never an end in itself. It has been used to facilitate easy learning and to provide experiences, in addition to making student administration easier in a non-stop course. The technology in our learning and student management systems has worked really well, and there have been hardly any noticeable breaks in the accessibility of the course pages.
Good distance learning courses are in fact not that easy and fast to set up. They require a dedicated person or a team with knowledge of both the subject and the course technology. The initial investment in time and effort is quite high, but then the course is fairly easy to keep up-to-date. However, maintenance is a key issue for successful course material, especially in a fast-developing area such as bioinformatics.
Distance learning courses are not the best choice for everybody and every course. They are very well suited to bioinformatics, because even live courses would include practical sessions on the Internet. It is crucial to have a person who is responsible for answering the students' questions, so that help with problems comes without delay (in our system, the next working day at the latest).
The distance learning concept has been a success in bioinformatics. We as teachers have been happy with the system and we have a large number of satisfied students. It has taken lots of effort to construct the courses. On the other hand, we could not have trained this many students during normal lecture/practical session courses. In addition, now students can take the courses at times most suitable for them.
We are greatly indebted to Ilkka Lappalainen and Juha Ollila, our coteachers; Jukka Lehtiniemi, web designer; and Hannu Korhonen, system manager.
The abbreviations used are: LMS, Learning Management System; SMS, Student Management System; IT, information technology.