Program Evaluation Toolkit (how to assess outreach programs)
What is Program Evaluation and why is it important?
The following section is adapted from Assessing Campus Diversity Initiatives [see References, Garcia et al.] and the User-Friendly Handbook for Project Evaluation [see References, NSF]. Not everyone enjoys the thought of conducting program evaluations, so it is worth considering why evaluation is useful and how it can contribute to your program.
“Evaluation is not separate from, or added to, a project, but rather is part of it from the beginning” (NSF User-Friendly Handbook for Project Evaluation, p. 3).
Program evaluation is a way of monitoring your progress, successes, and lessons learned; it is a way of telling your program’s story.
Reasons for Evaluation:
- Produces useful knowledge
- Documents and clarifies useful work
- Addresses concerns identified by the community being served
- Contributes to shaping institutional policy
- Allows for immediate corrections based on findings
- Provides context
Types of Evaluation
Summative and formative evaluation strategies need to be threaded throughout your site evaluation. The purpose of program evaluation at CISE REU sites is to examine how well the program has impacted retention in CISE majors and recruitment in CISE graduate programs.
Formative (program improvement):
- How is the program implemented?
- Are activities delivered as intended (fidelity of implementation)?
- Are participants being reached as intended?
- What are participant reactions?
Summative (program outcomes and accountability):
- To what extent can changes be attributed to the program?
- What are the net effects?
- What are final consequences?
- To what extent are desired changes occurring? Goals met?
- Who is benefiting/not benefiting? How?
- What seems to work? Not work?
- What are unintended outcomes?
Steps of Evaluation
Step 1: Logic Models
A logic model is a conceptual framework that describes the pieces of the project and the expected connections among them. A typical model has four broad categories of project elements connected by directional arrows: inputs, activities, short-term outcomes, and long-term outcomes [NSF User-Friendly Handbook for Project Evaluation].
- Project inputs = the various funding sources, resource streams, and contributions that provide support to the project.
- Activities = the services, materials, actions, and events that characterize the project’s thrusts and reach the target audience.
- Short-term outcomes = the immediate results of the activities.
- Long-term outcomes = the broader and more enduring impacts on the system (i.e., individuals, groups, communities, and organizations).
Project leaders may find the logic model useful not only for evaluation but also for program management: it provides a framework for monitoring the flow of work and checking whether required activities are being put in place. [NSF User-Friendly Handbook for Project Evaluation]
University of Wisconsin Extension provides an example logic model diagram and a downloadable template on its website: UWEX. [NSF ADVANCE]
Step 2: Human Subjects IRB Approval
Each university has an Institutional Review Board (IRB) to ensure that ethical research practices are followed throughout the campus, as guided by federal regulations. Institutional approval is required whenever you survey student attitudes and behaviors, or collect institutional data about students for reporting purposes. Each institution determines its own policies and procedures for human subjects research. As a project leader, you are strongly encouraged to contact your IRB office to discuss the specific scope of your site evaluation and to determine the appropriate protocol.
Please refer to the sample REU site IRB application form.
Step 3: Data Collection & Assessment
There are a variety of ways that programs can collect information on participants and outcomes. This section provides an overview of the primary ways of gathering information from site participants to measure the intended attitudinal outcomes, and addresses the political issues involved. It is not intended to be a comprehensive resource for data assessment; see http://www.starsalliance.org/eaReadings.
Before Data Collection
- Be sure to obtain the necessary permission and clearance:
  - Institutional Review Board approval of studies with human subjects
  - Consent from participants
  - Inform the Institutional Review Board of any changes in the study
- Be sensitive to the needs of participants:
  - Anonymity will be limited in small samples such as REU sites
  - Maintain confidentiality of personal information
- Cause as little disruption as possible
- Be sure that the data are handled and analyzed by an adequately trained, objective, and unbiased individual
In the current climate of accountability, a mixed-method approach to data collection is highly desirable. When investigating human behaviors and attitudes, using multiple collection methods enriches the study by minimizing the weaknesses of any one method. Using both quantitative and qualitative data collection serves to triangulate findings, i.e., to substantiate outcomes.
A typical mixed-method approach in a STARS Computing Corps might consist of the following data collection methods: surveys, individual student interviews, focus groups, knowledge tests, observations from faculty, and document reviews (of student presentations, posters, and papers). Each of these is described below.
Surveys – Surveys are an easy and cost-effective method of data collection. Most universities provide web-based software for delivering surveys (such as Survey Share or Survey Monkey); if not, accounts can be quickly established at reasonable rates. Surveys provide standardization, descriptive data, coverage of constructs (e.g., interest in graduate school, attitudes toward computing, self-efficacy, commitment to computing), and ease of analysis. A pre–post design is strongly encouraged to allow measurement of program impact, meaning that the same survey is administered to students at the beginning of the program and again at the end. Disadvantages of note are self-report bias and a potential lack of depth and context. See the Outreach for Computing Attitude Survey for Secondary Students (OCASSS).
Individual Interviews – By conducting interviews, we assume that the opinions and experiences of participants are meaningful and that their perspectives affect the project’s success. Interviews may be structured (no deviation in questions among interviewees), semi-structured (some variation at the interviewer’s discretion), or unstructured (usually in-depth interviews based on the interviewee’s subjective experience). Interviews are an effective means of exploring students’ experiences, perspectives, expectations, and what is salient to them. They allow depth of exploration and provide insight beyond a survey questionnaire. Disadvantages are that interviews are time consuming, require extensive interviewer training, and involve a lengthy data analysis process. Consult with social science departments on your campus to locate faculty expertise and assistance.
Focus Groups – Focus groups enable in-depth information to be collected, similar to individual interviews, with the advantage of efficiency in that they are conducted in small groups of 8–10 participants. As with interviews, focus groups allow insight into the subjective experiences and reflections of participants, with group interaction as the catalyst. However, focus group facilitators should be trained in delivery and in group dynamics, to encourage open participation from everyone and to minimize the dominance of a few members of the group. If group peer pressure is a concern at your site, focus groups are not recommended. Ideally, the facilitator is not a faculty advisor to the students.
Knowledge Tests – A test of students’ knowledge of computing skills is an effective measure of gains made from participating in STARS Computing Corps programs. Tests are generally perceived as credible and provide an objective measurement. The content being assessed will depend upon the particular areas addressed by the program, and the instrument must be carefully evaluated to ensure that it is valid and reliable. Self-report (via survey) of knowledge gains is also an acceptable measure.
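One common check of instrument reliability mentioned above is internal consistency, often summarized by Cronbach’s alpha. The sketch below is illustrative only; the item scores are hypothetical, and it uses nothing beyond Python’s standard library:

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for rows of per-respondent item scores.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(item_scores[0])                 # number of items on the instrument
    columns = list(zip(*item_scores))       # scores grouped by item
    item_vars = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Hypothetical responses: 5 students x 3 attitude items on a 1-5 scale
scores = [[4, 4, 3], [3, 3, 3], [5, 4, 5], [2, 2, 1], [4, 5, 4]]
print(round(cronbach_alpha(scores), 3))
```

Values of roughly 0.7 or above are commonly treated as acceptable internal consistency, though judgments about validity still require expert review of the instrument’s content.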
Observations – Observations can provide valuable insight into the behaviors of participants and how the program is operating. Observations identify unexpected events and outcomes and are easy to collect. However, observations are subjective and may not necessarily apply to all participants.
Document Review – Tracking the documents produced during the program is an easy and cost-effective method of triangulation. Any submitted projects, designs, papers, posters, presentations, etc., can be archived as evidence of outcomes. Use of a rating rubric is recommended. See http://edtechteacher.org/index.php/teaching-technology/assessment-rubrics and http://teachingcommons.depaul.edu/Feedback_Grading/rubrics.html for additional information.
Following up with participants after their experience can provide meaningful information, such as whether students actually pursue computing studies. However, cost and time are considerations once funding ends. While longitudinal surveys are not required, they are recommended: short, cost-effective surveys can be delivered online to former student participants to obtain computing retention information.
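The pre–post survey design described above is typically analyzed by comparing each student’s pre and post scores as matched pairs. A minimal sketch with hypothetical scores, using only Python’s standard library (in practice a tool such as scipy.stats.ttest_rel also reports the p-value):

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical pre/post attitude scores (1-5 scale) for 8 REU participants
pre  = [3.0, 2.5, 4.0, 3.5, 2.0, 3.0, 4.5, 3.0]
post = [3.5, 3.0, 4.5, 3.5, 3.0, 3.5, 4.5, 4.0]

diffs = [b - a for a, b in zip(pre, post)]  # per-student gain
n = len(diffs)
mean_gain = mean(diffs)
se = stdev(diffs) / sqrt(n)                 # standard error of the mean gain
t = mean_gain / se                          # paired t statistic, df = n - 1

print(f"mean gain = {mean_gain:.2f}, t({n - 1}) = {t:.2f}")
```

The resulting t statistic is compared against a t distribution with n − 1 degrees of freedom to judge whether the average gain is statistically distinguishable from zero; with samples this small, consult a statistician before drawing conclusions.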
[This section is an adaptation from the ADVANCE portal, the NSF User-Friendly Handbook for Project Evaluation, and Assessing Campus Diversity Initiatives (Garcia, et al.)]
Step 4: Analyze & Report
The following section provides an introduction to common statistical analyses. In many cases, Project Leaders do not have a background in quantitative social sciences or other fields that use multivariate statistics. Consulting with faculty or professionals on your campus who have expertise in multivariate statistics is recommended. For detailed information on statistical analyses in the behavioral sciences and program evaluation, we recommend the following resources.
- UCLA’s Academic Technology Services (Choosing appropriate analysis)
- Sample Size Calculator (determines the number of individuals needed for statistical power in your analysis)
- The Web Center for Social Research Methods: by Bill Trochim, Cornell University
- Tabachnick, B. G., & Fidell, L. S. (2007). Using Multivariate Statistics (5th ed.). Boston, MA: Pearson Education.
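As a rough illustration of what the sample size calculator listed above does, the n needed to detect a standardized effect d with a two-sided test can be approximated from normal quantiles as n ≈ ((z_{α/2} + z_{power}) / d)². The sketch below uses this normal approximation, which slightly underestimates the exact t-test answer:

```python
from math import ceil
from statistics import NormalDist

def approx_sample_size(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed (normal approximation to the t-test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    return ceil(((z_alpha + z_power) / effect_size) ** 2)

# Medium effect (d = 0.5) at alpha = 0.05 with 80% power
print(approx_sample_size(0.5))  # about 32 participants
```

Because a typical REU site enrolls far fewer students than such calculations suggest, single-site quantitative results are usually underpowered; this is one reason mixed methods and cross-site aggregation are emphasized throughout this toolkit.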
Conducting qualitative analyses is a complex and time-consuming endeavor that can be well worth the effort, depending upon the project. Typically, qualitative study is a precursor to quantitative investigation, and it can also serve as a method of triangulating existing quantitative findings. The following is by no means a comprehensive resource, but a short list of recommended sources for exploring the appropriate use and conduct of interviews, focus groups, and other qualitative methods.
Top Blogs for Qualitative Research: http://www.qual360.com/news-and-blogs/11-editor-s-pick-top-qualitative-research-blogs
RTI International: http://www.rti.org/page.cfm/Qualitative_Analysis
American Evaluation Association: http://www.eval.org/
Dissemination of project findings usually occurs at multiple levels, which can be generalized into two overarching areas. One is broad dissemination of the program’s effectiveness to the various constituent groups. The other is dissemination of program products throughout the community. Dissemination vehicles vary, from reports to the funding agency and institutional leadership to conference presentations and local press releases. The intent of this section is to address the broad level of program efficacy as delivered to institutional audiences.
Prior to any dissemination of site information, knowledge of the audience is key. This enables the report or presentation to discuss site findings and outcomes that are important to the audience, in relevant terminology. Always answer the question: what is in it for the audience? The following list suggests common constituent categories from which you can begin generating your own list.
Constituent Groups (Audience):
Funding Agencies
- The National Science Foundation and the Department of Education both require formal reporting of site activities and outcomes on an annual basis, with a culminating final report at grant termination. Identify contributions to the research knowledge base as well as the overall accomplishment of project goals. This report will include a detailed technical summary and executive highlights. Refer to the Reporting Cycles and Templates section for NSF reporting specifics.
Potential Funding Sources
- You may be seeking additional funding of the work undertaken by your project. Identify contributions and outcomes that link to the target goals for the funding you seek.
- This type of information dissemination provides details of program initiatives and results, and may include site information. The focus is most likely on communicating contributions that directly affect individuals and community policy and/or resources. Formats could include slide presentations, discussion forums, poster sessions, or short articles and case studies. Share results and lessons learned with colleagues.
Institutional & Community Partners
- This type of information dissemination may be less formal, provide fewer technical details, and focus on raising awareness, informing the institutions, and possibly calling for support and collaboration. Formats could include slide presentations, discussion forums, newsletters, fact sheets, or short articles. Share results and interpret their meaning by linking your goals to those of the division.
STARS Participant Schools