What data files do I need?

Updated by Mary Styers

So, you may have started to think about some interesting research goals, but now you need help understanding the data sources needed to run the analyses. We get it. It can be a lot to figure out on your own, especially if you’ve never collected data for this kind of research goal before. This resource page will help you determine which data files you will need, how to find them, and what to do if data is unavailable.

Why do you need data? 

The data that you're already collecting, or planning to collect, is what lets you address your research goal(s). One type of data may be enough for a research goal (e.g., usage analysis), or more than one type may be needed (e.g., outcomes analysis or cost analysis). This is because different data answers different questions. We’ll help you determine which data you’ll need for your specific research goal(s).

What data do you need? 

Depending on your research goal(s), you may need up to four main types of data to run a rapid cycle evaluation (RCE) with IMPACT™:

  1. Usage data
  2. Student information system (SIS) data
  3. Outcome data
  4. Pricing data

The following sections provide additional information about each of these types of data. 

For more information on preparing your data for RCE with IMPACT™, please head over to this help page or view our Data Use Guide.

✔ Usage data 

Usage data represents the extent to which a user (e.g., student) participated or engaged with the edtech tool. Examples of usage data (also known as metrics) include minutes on system, days of use, total number of logins, total lessons attempted, number of lessons completed, and number of lessons mastered. 

Only one usage metric is needed to run an RCE.
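
If your usage export is a spreadsheet or CSV file, a quick check like the one below can confirm that it contains at least one usage metric before you go further. This is only a sketch: the column names (`student_id`, `minutes_on_system`, and so on) are hypothetical, and the requirement for a student ID column is an assumption made for this example, so adjust them to match your vendor's export.

```python
import csv
import io

# Hypothetical column names; real vendor exports will use their own labels.
ID_COLUMN = "student_id"
USAGE_METRICS = {
    "minutes_on_system", "days_of_use", "total_logins",
    "lessons_attempted", "lessons_completed", "lessons_mastered",
}

def find_usage_metrics(csv_text):
    """Return the recognized usage metric columns in a CSV export, sorted by name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = set(reader.fieldnames or [])
    if ID_COLUMN not in headers:
        # Assumption for this example: each row is tied to a student identifier.
        raise ValueError(f"missing identifier column: {ID_COLUMN}")
    return sorted(headers & USAGE_METRICS)

# Made-up sample export with two usage metrics.
sample = "student_id,minutes_on_system,total_logins\n1001,340,12\n1002,95,4\n"
metrics = find_usage_metrics(sample)
print(metrics)  # ['minutes_on_system', 'total_logins']
```

If the returned list is empty, the file has no usage metric that this check recognizes, and you may need to ask the vendor for a different export.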

How do I find usage data? Edtech vendors often allow administrators to export usage data. Depending on the edtech tool that you choose, you may need to contact your district administrator in charge of the intervention or the edtech company directly to get usage data. 

Our team can help you determine whether a vendor's usage data is appropriate for RCE with IMPACT™.  

✔ Student information system (SIS) data 

Student information system (SIS) data consists of student identifiers and demographic data. Examples include student ID, grade level, school, race/ethnicity, gender, special education status, free/reduced lunch status, and teacher. 

There are two ways that IMPACT uses SIS data:

  • To account for differences in the main analysis: IMPACT uses SIS data as covariates in the main analysis, because differences in these characteristics across students, classrooms, or schools may influence the study outcome. For example, the comparison group might have more students receiving special education services. By including an indicator for special education services, we can account for this variation between the groups so that it does not influence the results.
  • To examine differences in results by subgroups: IMPACT allows you to examine overall results and results for specific subgroups. If you include SIS data, you will be able to drill down into usage and performance within SIS subgroups (e.g., grade level, gender, school). For example, did using the edtech product lead to better learning outcomes in some grade levels than in others?
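
To get a feel for what a subgroup breakdown involves, here is a minimal sketch that matches usage records to SIS records by student ID and averages usage within each grade level. It is not how IMPACT computes its results; the function name, field names, and sample values are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

def mean_usage_by_subgroup(usage, sis, field):
    """Average a usage metric within each value of one SIS field.

    usage: dict mapping student ID -> usage value (e.g., minutes on system)
    sis:   dict mapping student ID -> dict of SIS fields
    field: name of the SIS field to group by (hypothetical, e.g., "grade_level")
    """
    groups = defaultdict(list)
    for student_id, value in usage.items():
        record = sis.get(student_id)
        if record is None:
            continue  # no matching SIS record, so this student cannot be grouped
        groups[record[field]].append(value)
    return {group: mean(values) for group, values in groups.items()}

# Invented sample data: student 1004 has usage data but no SIS record.
usage = {"1001": 340, "1002": 100, "1003": 200, "1004": 50}
sis = {
    "1001": {"grade_level": "3"},
    "1002": {"grade_level": "3"},
    "1003": {"grade_level": "4"},
}
print(mean_usage_by_subgroup(usage, sis, "grade_level"))
```

Notice that students without a matching SIS record drop out of the breakdown, which is one reason consistent student IDs across files matter.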

How do I find SIS data? You may need to contact your district's SIS Administrator or Data Manager, either to get access to that data or to include them in the data discovery process. 

SIS data is optional when running an RCE, regardless of your research goals.
Still, include as much SIS data as possible to rule out the effects of other variables, so that you can be more confident that effects are attributable to the intervention. 

✔ Outcome data 

Outcome data include measures of educational outcomes that the edtech product claims to impact. Examples include standardized assessment scores, degree attainment, and attendance. There are two types of outcome data:

  • Pre-intervention outcome data refers to performance or scores prior to using the edtech product. We use this data as a covariate in IMPACT to account for differences in student performance before the intervention (i.e., edtech product) was available to students.
  • Post-intervention outcome data refers to performance or scores after using the edtech product. We use this data to examine student performance and impact after receiving the intervention.

In an Outcomes Analysis without a Comparison Group, you examine the relationship between edtech product usage and your post-intervention outcome. As such, outcome data are required for this type of analysis.
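
As a rough illustration of what a "relationship between usage and outcome" can look like, the sketch below computes a simple Pearson correlation between a usage metric and a post-intervention score. This is an assumption chosen for the example: IMPACT's own analysis is not described on this page and may differ (for instance, by including covariates). The sample values are invented.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("a list with no variation has no defined correlation")
    return covariance / (spread_x * spread_y)

# Invented data: minutes on system and post-intervention scores for four students.
minutes = [30, 60, 90, 120]
post_scores = [55, 60, 70, 75]
print(round(pearson_r(minutes, post_scores), 2))
```

A value near 1 means higher usage tends to go with higher scores, a value near -1 means the opposite, and a value near 0 means little linear relationship. A correlation alone does not show that the product caused the difference.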

In an Outcomes Analysis with a Comparison Group, you examine if students who use the edtech product outperform other students, based on post-intervention outcome performance. Outcome data are required for this type of analysis.

How do I find outcome data? You may need to contact your district's Assessment Director. This person will not only help you access outcome data, but will likely also point you to what data the district finds meaningful. 

It is important to choose an outcome that you expect to be influenced by the edtech product.

✔ Pricing data 

Pricing data includes the cost of the edtech product subscription or license, priced per student or site. Pricing data may also include any direct and indirect costs associated with owning the product (e.g., professional development, staff hours). Pricing data are included in a Cost Analysis. 
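
As an illustration of how these pieces might combine, the sketch below adds a license price to other ownership costs and divides by the number of students. The formula, function name, and figures are assumptions for this example, not the calculation that a Cost Analysis in IMPACT performs.

```python
def cost_per_student(license_cost, other_costs, n_students):
    """Total cost of ownership divided by the number of students.

    license_cost: subscription or license price for the period
    other_costs:  dict of additional direct and indirect costs
                  (hypothetical keys, e.g., professional development)
    n_students:   number of students covered
    """
    if n_students <= 0:
        raise ValueError("n_students must be positive")
    total = license_cost + sum(other_costs.values())
    return total / n_students

# Invented figures: a 12,000 site license plus training and staff time, 500 students.
extras = {"professional_development": 2000, "staff_hours": 1000}
print(cost_per_student(12000, extras, 500))  # 30.0
```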

How do I find pricing data? Pricing is usually entered during procurement and contract management and, as a result, is pulled from LearnPlatform with no additional data entry. Alternatively, price information can be submitted while setting up the RCE if it has not been shared with LearnPlatform prior to running a Cost Analysis. If you are taking this approach, you may need to contact someone within your organization who has purchasing information for the edtech tool of interest.

Check out the RCE Data Collection Checklist for a helpful analysis of your files before submission.

What if the data you need is unavailable or inaccessible? 

There are many reasons the data might not be available. Sometimes data is inaccessible because the vendor will not grant access or is not currently collecting the data. Certain edtech tools or vendors may not have an administrator or a data collection mechanism in place. Other times, the data is presented in an unusable way (for example, a PDF file vs. a spreadsheet). If the data needed is not available, the research goal cannot be addressed, which is why it is important to check what data is available while you are formulating your research goal. 

