Evaluating Digital Infrastructure for Education Research
The right evaluation question is rarely whether a platform looks feature-rich. It is whether the system can carry the study design, the evidence burden, the governance requirements, and the operational reality of the work.
Why this matters
Education research infrastructure is often assessed through demos, procurement checklists, or implementation convenience. Those views matter, but they are too thin on their own. A research-ready system has to preserve meaning, comparability, stewardship, and accessibility while remaining operationally usable.
In brief
- Feature lists are a weak proxy for research readiness.
- Infrastructure should be evaluated against study design, not just delivery convenience.
- Meaningful event models, metadata, and governance have to be visible early.
- Accessibility and operational fit belong inside infrastructure review, not after it.
Digital infrastructure is often chosen by asking whether a platform can do the thing a team needs right now. Can it host the activity, deliver the session, export a package, or provide a dashboard? Those questions are understandable. For education research, they are incomplete.
A tool can deliver a workflow and still be poorly aligned with the study it is meant to support. It can flatten important differences between cohorts, infer what should have been captured directly, collect more personal data than the work needs, or rely on accessibility and support labour that only becomes visible after rollout.
Start with the study design, not the demo
SoLAR's 2025 definition of learning analytics is useful because it treats the field as more than reporting. It frames analytics as collection, analysis, interpretation, and communication in support of theoretically relevant and actionable insight.[1] Jisc's Code of Practice reinforces the same point from a governance angle by emphasising validity, transparency, and stewardship.[2]
That means infrastructure evaluation should begin with study questions. What conditions need to be distinguished? Which stages, cohorts, facilitation modes, or intervention variants must remain visible in the data? If a platform cannot represent those distinctions clearly, a long feature sheet does not rescue it.
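As a concrete illustration, here is a minimal Python sketch of the kind of study context a platform's event schema would need to carry. The field names are hypothetical, chosen for illustration rather than drawn from any particular platform or standard.

```python
from dataclasses import dataclass

# Hypothetical sketch: the study-design distinctions that must stay
# visible on every logged event. Field names are illustrative only.
@dataclass(frozen=True)
class StudyContext:
    cohort: str              # e.g. "2025-autumn"
    stage: str               # e.g. "baseline", "intervention", "follow-up"
    facilitation_mode: str   # e.g. "in-person", "remote", "hybrid"
    variant: str             # which intervention variant the participant received

def can_represent(platform_event_fields: set[str]) -> bool:
    """Crude evaluation check: can the platform's event schema
    carry the distinctions the study design depends on?"""
    required = {"cohort", "stage", "facilitation_mode", "variant"}
    return required <= platform_event_fields
```

If a field like facilitation mode cannot travel with the event, it has to be reconstructed later, and reconstruction is where comparability quietly erodes.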
Evaluate data meaning and provenance
In research settings, the central issue is not whether a system logs activity. It is whether the logged activity carries stable meaning. xAPI, xAPI Profiles, and Caliper all point in the same direction: event definition and taxonomy matter.[3][4][5]
FAIR adds a further requirement: data has to remain findable, accessible, interoperable, and reusable enough that later interpretation is not guesswork.[6] In practice, that pushes evaluation teams to ask a harder set of questions; a sketch of what strong answers can look like follows the list.
- What events are defined explicitly, and what do those events mean?
- What metadata is captured for context, provenance, study condition, and timing?
- Which measures are direct captures, and which are later inferences?
- How are instrumentation changes versioned across releases or cohorts?
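As one illustration, here is a minimal xAPI-style statement expressed as a Python dict. The verb IRI comes from the published ADL vocabulary; the context extension IRIs are hypothetical placeholders standing in for identifiers a real xAPI Profile would register.

```python
# Illustrative xAPI-style statement as a Python dict. The extension
# IRIs under "context" are hypothetical placeholders, not registered
# profile identifiers.
statement = {
    "actor": {"account": {"homePage": "https://example.org", "name": "p-1042"}},
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/completed",  # explicit, stable meaning
        "display": {"en": "completed"},
    },
    "object": {"id": "https://example.org/activities/module-3"},
    "timestamp": "2025-03-14T10:22:00Z",
    "context": {
        "extensions": {
            # provenance and study condition travel with the event
            "https://example.org/ext/cohort": "2025-autumn",
            "https://example.org/ext/condition": "variant-b",
            # direct capture vs later inference is marked explicitly
            "https://example.org/ext/capture-mode": "direct",
            # instrumentation version supports comparability across releases
            "https://example.org/ext/instrumentation-version": "1.4.0",
        }
    },
}
```

The point is not the syntax. It is that condition, provenance, and version information are recorded at capture time rather than inferred afterwards.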
If the answers are vague, the platform may still operate well enough as software, but it is weaker as research infrastructure.
Evaluate governance and minimisation early
Governance is often treated as a legal review that happens after a tool already exists. That sequence makes institutional work harder. DELICATE, Cormack's framework, and ICO guidance all converge on the same practical lesson: governance and privacy have to be built into analytics practice rather than attached to it later.[7][8][9][10]
For evaluation teams, this means reviewing data flow rather than policy language alone. What personal information is actually needed? How long is it kept? What access routes exist? Can the collection be explained clearly to participants and institutional reviewers? A system that cannot answer those questions cleanly is expensive to govern even if it is attractive to buy.
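One way to make that review concrete is a field-level data-flow inventory. The sketch below is a hypothetical review artefact, not a real API: each collected field carries its purpose, retention limit, and access routes, and anything unjustified is flagged.

```python
from dataclasses import dataclass

@dataclass
class FieldRecord:
    """One row of a data-flow inventory: what is collected, why,
    for how long, and who can see it. Field names are illustrative."""
    name: str
    purpose: str                  # empty string means no declared research purpose
    retention_days: int           # 0 or less means no retention limit
    access_roles: tuple[str, ...]

def minimisation_gaps(inventory: list[FieldRecord]) -> list[str]:
    """Flag fields that cannot be justified cleanly to participants
    or institutional reviewers."""
    gaps = []
    for f in inventory:
        if not f.purpose:
            gaps.append(f"{f.name}: no declared purpose")
        if f.retention_days <= 0:
            gaps.append(f"{f.name}: no retention limit")
        if not f.access_roles:
            gaps.append(f"{f.name}: access routes undefined")
    return gaps

print(minimisation_gaps([FieldRecord("date_of_birth", "", 0, ())]))
```

An inventory like this is also the artefact that makes the collection explainable: the same table answers participants, ethics reviewers, and procurement.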
Evaluate accessibility and operational adoption
Accessibility should not be separated from infrastructure review. W3C's WCAG 2.2 remains the practical baseline for public digital systems.[11] If a platform is difficult to use with a keyboard, weak on headings or focus states, or unclear in its media alternatives, that is not a cosmetic issue. It changes who can participate and how reliably the workflow is delivered.
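Some of this can be checked automatically. The sketch below, using only the Python standard library, flags skipped heading levels in rendered HTML; it covers one small slice of WCAG 2.2, and keyboard and focus behaviour still need hands-on testing.

```python
from html.parser import HTMLParser

class HeadingAudit(HTMLParser):
    """Minimal sketch of one automatable WCAG-adjacent check:
    flag skipped heading levels (e.g. an h4 directly after an h2)."""
    def __init__(self):
        super().__init__()
        self.previous_level = 0
        self.problems: list[str] = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1].isdigit():
            level = int(tag[1])
            if self.previous_level and level > self.previous_level + 1:
                self.problems.append(
                    f"h{self.previous_level} followed by h{level}: skipped level"
                )
            self.previous_level = level

auditor = HeadingAudit()
auditor.feed("<h1>Study</h1><h2>Method</h2><h4>Consent</h4>")
print(auditor.problems)  # ['h2 followed by h4: skipped level']
```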
Operational adoption matters for a similar reason. 1EdTech's implementation guidance stresses that effective learning analytics depends on institutional process, not on a standard alone.[12] When a platform requires hidden support labour, manual repair, or constant explanation, the institution is not evaluating only software. It is evaluating a continuing operational burden.
Evaluate interoperability and change over time
Research infrastructure rarely stays fixed. Cohorts change, interventions evolve, and reporting needs become more precise once work is underway. A useful evaluation therefore asks what happens after the first release. Can the platform preserve comparability when instrumentation changes? Can it move data or content through standard export routes? Can implementation notes survive staff turnover?
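Here is a minimal sketch of what preserving comparability can mean in practice: raw events keep the version of the instrumentation that produced them, and analysis normalises them through explicit, reviewable migrations. The field names and the renamed measure are hypothetical.

```python
# Hypothetical sketch: suppose release 2 replaced "duration_s"
# with "duration_ms". Raw events keep their original version tag;
# normalisation is an explicit, documented step, not silent repair.
def normalise(event: dict) -> dict:
    version = event.get("instrumentation_version", "1")
    out = dict(event)
    if version == "1" and "duration_s" in out:
        out["duration_ms"] = out.pop("duration_s") * 1000
        out["instrumentation_version"] = "2"
    return out

v1_event = {"instrumentation_version": "1", "duration_s": 42}
print(normalise(v1_event))  # {'instrumentation_version': '2', 'duration_ms': 42000}
```

A platform that supports this pattern leaves an audit trail from raw capture to comparable measure. A platform that forces silent in-place changes does not.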
This is where institutional teams often discover the difference between a pilot that looks impressive and a platform that can stand up to repeated use. The more a system relies on undocumented local knowledge, the weaker it becomes as shared infrastructure.
A five-part scorecard
For practical review, it helps to compress the evaluation into five lenses; a minimal way to record them follows the list.
- Can the system represent the study design faithfully?
- Can the data remain meaningful, contextualised, and comparable?
- Can governance and minimisation be explained and operated cleanly?
- Can people actually use the workflow accessibly and consistently?
- Can the platform survive change, integration, and scale without hidden repair work?
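As a sketch only, the five lenses could be recorded as a simple review structure. The 0-3 scale and field names below are illustrative, not a published rubric.

```python
from dataclasses import dataclass, fields

@dataclass
class Scorecard:
    """The five lenses as a review record. Hypothetical 0-3 scale:
    0 = cannot do it, 3 = demonstrated in practice."""
    study_design_fit: int
    data_meaning: int
    governance: int
    accessibility_and_use: int
    durability: int

    def weakest(self) -> str:
        """Procurement conversations usually start at the lowest score."""
        return min(fields(self), key=lambda f: getattr(self, f.name)).name

review = Scorecard(3, 2, 1, 2, 1)
print(review.weakest())  # 'governance'
```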
That scorecard will not produce a perfect procurement decision on its own, but it is a stronger starting point than treating infrastructure evaluation as a comparison of headline features.
Closing
The most expensive mistake in education research infrastructure is often not buying a weak platform. It is buying a platform that appears capable until the research, the governance, and the operational burden all become visible at once.
That is why infrastructure evaluation has to look underneath the demo: at the event model, the metadata, the accessibility baseline, the provider boundaries, and the way a tool will behave once it is asked to support real studies rather than a sales narrative.
References
- [1] Society for Learning Analytics Research. Reimagining Learning Analytics. 2025.
- [2] Sclater N, Bailey P. Code of Practice for Learning Analytics. Jisc.
- [3] Advanced Distributed Learning. xAPI Specification (IEEE 9274.1.1).
- [4] Advanced Distributed Learning. xAPI Profiles Specification.
- [5] 1EdTech. Caliper Analytics.
- [6] Wilkinson MD, Dumontier M, Aalbersberg IJJ, et al. The FAIR Guiding Principles for scientific data management and stewardship.
- [7] Drachsler H, Greller W. Privacy and analytics: it's a DELICATE issue.
- [8] Cormack A. A data protection framework for learning analytics.
- [9] Information Commissioner's Office. Principle (c): Data minimisation.
- [10] Information Commissioner's Office. Data protection by design and by default.
- [11] World Wide Web Consortium. Web Content Accessibility Guidelines (WCAG) 2.2.
- [12] 1EdTech. Six Steps for Effective Learning Analytics Implementation in Postsecondary Education.