COMPX241 Emerging Project Ideas

Frisbee Golf, AoE

Project Manager: Musawar Ahmad
Team Members: Cameron Stewart; Devesh Patel; Daniel Hoskin; and Cuan Rose
Meeting Time: Wednesdays 1-2pm, TC.4.15

Key Idea: Combine real-time body tracking with a 3D globe to translate a player’s physical throwing motion into a simulated Frisbee flight in a shared virtual environment.

  • Keywords: Computer Vision; 3D Graphics; Physics Simulation; Client-Server Software Architecture.

Frisbee golf is typically played in parks and open spaces, where players throw a disc towards a target in as few throws as possible. But what if the entire world became your course? Down 5th Avenue in New York, across the Waikato River, or even between landmarks in your hometown—the possibilities are endless.

This project explores how modern web technologies can be combined to create a location-independent, physically inspired version of Frisbee golf. Using a 3D globe such as CesiumJS, players select a starting point and a target anywhere on Earth. When it is their turn, they mime a throwing motion in front of their device. A pose detection system—such as MediaPipe or OpenPose—captures aspects of the movement (angle, speed, direction), which are then mapped to a simulated Frisbee trajectory rendered in the 3D environment.
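
As a concrete illustration, the sketch below shows one way the captured motion could be reduced to throw parameters. It assumes a pose-tracking loop (such as MediaPipe's pose landmarker) is already delivering the throwing-hand wrist position each frame; the sampling window and units are arbitrary choices to tune.

```javascript
// Sketch: reduce a tracked throwing motion to release parameters.
// Assumes a pose-tracking loop (e.g. MediaPipe) that yields the
// throwing-hand wrist position each video frame, in normalised
// [0..1] image coordinates.

const samples = []; // recent {x, y, t} wrist positions

function onWristSample(x, y, tMillis) {
  samples.push({ x, y, t: tMillis });
  if (samples.length > 8) samples.shift(); // keep a short history
}

// Call when the player finishes the throwing motion.
function estimateThrow() {
  if (samples.length < 2) return null;
  const a = samples[0];
  const b = samples[samples.length - 1];
  const dt = (b.t - a.t) / 1000; // seconds
  return {
    speed: Math.hypot(b.x - a.x, b.y - a.y) / dt, // relative units/second
    angle: Math.atan2(-(b.y - a.y), b.x - a.x),   // radians; image y grows downwards
  };
}
```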

The goal is not to achieve perfect physical realism, but rather to create a convincing and engaging interaction where the player feels that their motion directly influences the outcome. A simple physics model—parabolic motion with optional wind or obstacles—is more than sufficient to produce a satisfying result.
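
A minimal flight model along these lines might look like the following sketch; the constants and the constant-wind term are placeholder assumptions to tune for feel rather than physical accuracy.

```javascript
// Sketch: "good enough" flight model -- projectile motion plus a constant
// wind term, stepped with simple Euler integration.
const GRAVITY = 9.81;

function simulateFlight(speed, angleRad, windX = 0, dt = 0.02) {
  let x = 0, y = 0;
  let vx = speed * Math.cos(angleRad);
  let vy = speed * Math.sin(angleRad);
  const path = [];
  while (y >= 0 && path.length < 10000) {
    path.push({ x, y });
    vx += windX * dt;   // wind nudges the disc sideways each step
    vy -= GRAVITY * dt; // gravity pulls it down
    x += vx * dt;
    y += vy * dt;
  }
  return path; // sample points to render as the trajectory arc
}
```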

To help keep the project within scope, the experience can be structured as a turn-based multiplayer game, where players take turns throwing towards a shared target. A lightweight backend (e.g., using WebSockets or a cloud service) can be used to synchronise game state between participants.
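
For the backend, a sketch of the WebSocket relay idea using Node.js and the ws package (one option among many) could be as simple as:

```javascript
// Sketch: a turn relay using the Node.js "ws" package (npm install ws).
// Each throw a client sends is broadcast to the other players, so every
// browser can replay the same flight.
const { WebSocketServer, WebSocket } = require('ws');

const wss = new WebSocketServer({ port: 8080 });

wss.on('connection', (socket) => {
  socket.on('message', (data) => {
    // e.g. data is JSON such as {player, speed, angle} for the latest throw
    for (const client of wss.clients) {
      if (client !== socket && client.readyState === WebSocket.OPEN) {
        client.send(data.toString());
      }
    }
  });
});
```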

Things to ponder:

  • Gesture interpretation: How much of the body motion is needed? Is tracking the arm sufficient, or can additional cues (e.g., wrist flick) enhance the experience?
  • Gameplay design: How are courses defined? Are they fixed, or can players create their own “holes” anywhere in the world?
  • Environment interaction: Is there a way to factor in the physicality of buildings to affect the flight of the Frisbee?
  • Visual feedback: How is the trajectory shown? A simple arc, or something more expressive?

Thoughts on getting started:

  • The main choice for the base system would seem to be between CesiumJS and the Google Maps JavaScript 3D API. In both systems it is possible to layer in Google Photorealistic 3D Tiles; a minimal viewer sketch follows this list.
  • Take some time to review Cesium Sandcastle demos. One that caught my eye places OpenStreetMap buildings into Cesium. (Note: not all of the live demos work "as-is" due to non-backwards-compatible updates to the Cesium API.)
  • For the Google Maps 3D API, this Colab notebook caught my eye.
  • For either platform you will eventually want to move to installing and running the code locally, in which case you might like to move straight to setting up and working with example code directly on your own computer.
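
To make the first bullet concrete, here is a minimal CesiumJS sketch that layers in Google Photorealistic 3D Tiles. Cesium's API changes between releases (as noted above), so treat the exact call names as a starting point and check the current Sandcastle examples; an access token/key for the tiles service is also assumed to be configured.

```javascript
// Sketch: a CesiumJS globe with Google Photorealistic 3D Tiles layered in.
const viewer = new Cesium.Viewer('cesiumContainer', {
  timeline: false,
  animation: false,
});

async function addPhotorealisticTiles() {
  // Requires a key for the Google tiles service -- see the Cesium docs.
  const tileset = await Cesium.createGooglePhotorealistic3DTileset();
  viewer.scene.primitives.add(tileset);
}

addPhotorealisticTiles();
```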

Te 0AD

Project Manager: Charles Serrato
Team Members: Callum Matthew; Lloyd Nguyen; Aidan Pine; Zach Brough; and Dylan Simpson
Meeting Time: Mondays 2-3pm, K.B.07

Key Idea: A Māori-centric extension to 0AD.

  • Keywords: 2D Graphics; 3D Graphics; Game Modding.

A year or so back I attended a think-tank meeting in Christchurch focused on Māori aspirations in the digital realm. At the meeting, some participants noted that you rarely see Māori people represented in video games, which must have a profound effect on identity within Māoridom. When it comes to promoting Māori culture to their own tamariki—never mind further afield—such an omission is particularly damaging: it is not just that games lack characters their kids would readily identify with, but that different role models are actively promoted in their place. A similar observation was made about the values represented in a wide range of video games.

These observations have remained with me, and when I came across the Open Source 3D Real-time Strategy (RTS) game 0AD, modelled after the genre-defining game Age of Empires by Ensemble Studios, I saw the opportunity for a project that could do something about this. In 0AD, as in the original Age of Empires, there are many races a user can play in the game, but not the Māori people. There are many types of geographical regions you can play in, but not specifically Aotearoa. Different types of flora and fauna are available, but not forms indigenous to these shores.

The aim of this project, then, is to develop a NZ-centred version of the game to play. Not just in terms of the graphics provided, but also in terms of the value system embodied in gameplay. In times of conflict the Māori people have a well-earned reputation as fierce warriors. Less well known, by and large, is their history as a trading nation and their innovation in agriculture. The good news is that 0AD has a plugin architecture, so game mods can be made. This is the starting point for this Smoke and Mirrors project.

Useful Links:

Thoughts on getting started:

  • Immerse yourself in the source code for this project. Learning how to compile it all up will be invaluable.
  • In the first instance, look to use a VM platform such as VirtualBox or VMware running a distribution of Linux, so it is easy—with admin rights—to install the prerequisites needed before you then look to compile up the source code.
  • Once you have a successful build on one of the group's computers, it will be easier to figure out next steps as to how others might install and run the software.

Zork: The Illustrated Trilogy

Project Manager: Daniel Aneke
Team Members: Muna Hashi; Ben Jarrett; Vincent Queenin; and Yicheng Wang
Meeting Time: Wednesdays 2-3pm, TC.4.15

Key Idea: Reimagine the classic text adventure Zork by taking its Open Source codebase and augmenting its gameplay with AI-generated visuals, improved natural language input, and atmospheric audio to bring its world to life.

  • Keywords: Natural Language Processing (NLP); Programming Languages; Generative AI; Audio Processing.

Zork is one of the most iconic text adventure games ever created. Developed in the late 1970s, it was originally written in a Lisp-derived language known as ZIL (Zork Implementation Language), and later compiled to run on the Z-machine—a virtual machine designed to allow the same game to run across a wide range of hardware platforms. Players explore a richly described world using text commands, relying on imagination to visualise their surroundings. Commands such as “go north”, “open door”, or “take lamp” were interpreted by a sophisticated parser for its time, allowing for a level of interaction that helped define an entire genre of games.

This project explores what happens when that imagination is supplemented—rather than replaced—by modern technology. By building on open-source implementations of Zork I, II, and III, the aim is to create an enhanced version of the game that introduces visual, auditory, and interaction-based elements while preserving the core gameplay experience. One possible technical pathway is to treat the original Z-machine-based system as a black box, while another is to experiment with translating parts of the codebase into a more modern language such as JavaScript, potentially with the assistance of coding-focused LLM tools.

A central feature of the project is the visualisation of the game world. Rather than generating images entirely on demand, a practical approach is to begin with a statically generated set of images—one for each location in the game—created using generative AI tools. Systems designed to maintain stylistic consistency across multiple images (for example, tools such as Google’s Nano Banana-style workflows) may be particularly useful here. These base images can then be extended dynamically: when the player revisits a location, the original prompt can be augmented with additional context describing the current game state (such as objects present or actions taken), and a refined image generated to reflect those changes. This creates a balance between consistency and responsiveness.
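
As a sketch of the prompt-augmentation idea, the snippet below shows the shape of the logic; the location key, state fields, and prompt wording are illustrative inventions, not drawn from the actual Zork data files.

```javascript
// Sketch: augment a location's base image prompt with current game state
// before requesting a refreshed image from your chosen generation API.
const basePrompts = {
  'west-of-house': 'A white house in a forest clearing, boarded front door, fantasy painting style',
};

function buildPrompt(locationId, state) {
  let prompt = basePrompts[locationId];
  if (state.openedMailbox) prompt += ', a small mailbox standing open';
  if (state.items.length > 0) {
    prompt += `, with ${state.items.join(' and ')} lying on the ground`;
  }
  return prompt; // hand this to your chosen image-generation API
}

console.log(buildPrompt('west-of-house', { openedMailbox: true, items: ['a leaflet'] }));
```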

The user interface can also evolve beyond text alone. For example, the player’s character could be displayed visually on screen, with inventory items appearing on or alongside the character as they are collected. As items are dropped or used, they are removed from this visual representation. Similarly, encounters with other entities in the game—such as monsters—could be supported with corresponding imagery. During interactions such as combat, visual cues (for example, changes in appearance or indicators of damage) can be used to reflect the current state of both the player and their opponent.

Interaction with the game can be enhanced through more flexible language input. Instead of requiring strictly formatted commands, natural language processing techniques can be used to interpret more conversational input, translating it into the sequence of actions understood by the underlying game engine. Voice input could also be incorporated, further broadening how players engage with the system.
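
A first pass at this need not involve a language model at all. The sketch below shows a simple rule-based normaliser that maps conversational phrasing onto the strict commands the parser expects; an LLM could then be reserved for the cases it misses. The synonym table is an illustrative stub.

```javascript
// Sketch: rule-based normalisation of conversational input into the
// strict commands the Z-machine parser expects.
const verbSynonyms = {
  'pick up': 'take',
  grab: 'take',
  head: 'go',
  walk: 'go',
};

function normalise(input) {
  let text = input.toLowerCase().trim();
  // "could you please..." style padding means nothing to the parser
  text = text.replace(/^(please |can you |could you )+/, '');
  for (const [informal, canonical] of Object.entries(verbSynonyms)) {
    text = text.replace(new RegExp(`\\b${informal}\\b`), canonical);
  }
  return text;
}

console.log(normalise('Please pick up the lamp')); // -> "take the lamp"
```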

Audio provides an additional layer of immersion. Different locations can be accompanied by subtle environmental sounds—dripping water in caves, echoes in underground chambers, or ambient noise in open spaces—helping to reinforce the atmosphere described in the text.

The goal is not to replace the charm of the original, but to explore how its core ideas—exploration, imagination, and interaction through language—can be extended using contemporary tools. The end result should feel like a modern reinterpretation of a classic, rather than a complete departure from it.

Potentially useful resources:

The above APIs are provided as representative examples to help get you started. It is recommended that you begin by experimenting with the many browser-based “try it out” tools available, using them to explore what is possible and refine your design choices. As many major platforms now offer suites of generative AI services, there may also be advantages in selecting tools from a single ecosystem when building your solution.

Thoughts on getting started:

  • When thinking about the why? for this project, the delivery of Zork: The Illustrated Trilogy works best if what is run as the game is an authentic version of the software—not a brand-new implementation that sources the data files of the original but is not guaranteed to behave the same.
  • Hitting this mark does not, however, straitjacket the whole project into coding everything in the ZIL programming language. Far from it. The suggested programming paradigm is to write a wrapper program in a more popular and commonly used programming language, such as JavaScript, that forms a text-based bi-directional communication channel between it and a running instance of the authentic Zork bytecode.
    • The wrapper program forms the outward-facing interface to the user.
    • Based on input from the user, the wrapper program can stream old-school text commands to the actual running game. The game then produces output in response, which is returned to the wrapper program.
    • The wrapper program analyses the text that is returned, and decides how the enhanced user interface is going to be updated.
  • The fundamental communication mechanism described above is called a pipe. It allows output from one running process to be streamed to another process as input. Note that a single pipe is one-directional, not bi-directional as the wrapper program requires. Bi-directional communication is achieved by forming one pipe from Program A to Program B, and then a second pipe from Program B to Program A.
  • This bi-directional communication setup between two programs is so useful that many programming languages provide packages that do the heavy lifting for you, delivering a class you can use to carry out bi-directional data communication (a minimal Node.js sketch follows this list). Note: it is more accurate to describe the communication as being between two processes, rather than two programs, as that is strictly speaking what runs on the CPU. Full disclosure: when setting this up in practice you actually end up with three streams (pipes) that data is sent over. This is because a running program can output data to "standard-out" but also "standard-error", where the latter (as the name suggests) is designed for error messages; these carry a higher level of significance, and it is often important to be able to separate them from the regular output the program generates.
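
Putting this into code, here is a minimal sketch of the wrapper's plumbing in Node.js. It assumes a text-mode Z-machine interpreter is installed (dfrotz is one such interpreter); the binary name, story-file path, and the updateInterface function are all placeholders to replace with your own.

```javascript
// Sketch: wrapping a running Z-machine interpreter with two pipes (plus
// the separate standard-error stream noted above).
const { spawn } = require('child_process');

// Hypothetical interpreter and story file -- substitute your own.
const game = spawn('dfrotz', ['zork1.z3']);

// Pipe 1: game output -> wrapper. Analyse it, then update the enhanced UI.
game.stdout.on('data', (chunk) => {
  updateInterface(chunk.toString()); // placeholder for your UI layer
});

// The third stream: standard-error, kept separate from regular output.
game.stderr.on('data', (chunk) => {
  console.error('game error:', chunk.toString());
});

// Pipe 2: wrapper -> game. Send a classic text command on the user's behalf.
function sendCommand(command) {
  game.stdin.write(command + '\n');
}

sendCommand('open mailbox');
```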

Operation Anatomy Explorer: In all Shapes and Sizes

Project Manager: Riley Cooney
Team Members: Payten Conder; Adibah Humayun; Tanisha Prasad; Keir Ward; and William Hernandez
Meeting Time: Mondays 2-3pm, K.B.07

Key Idea: Inspired by the formation of the Graduate School of Medicine at Waikato, develop an interactive 3D anatomy learning system that helps users understand and identify structures of the human body—particularly the skeleton—through exploration, visualisation, and testing.

  • Keywords: Graphics; Interactive Visualisation; Human-Computer Interaction; Web Technologies.

While the origins of this project lie in reimagining the board game Operation in a digital setting, the focus has shifted more strongly towards education. The aim is not to recreate the mechanics of the original game, but to explore how interactive 3D technologies can support the learning of human anatomy in a meaningful and engaging way.

Understanding human anatomy is fundamentally about spatial reasoning: knowing not just what structures are called, but where they are, how they relate to one another, and how they are oriented in the body. In early medical training, this is often reinforced through “spot tests”, where students are asked to identify structures on physical specimens. This project explores how such learning can be supported through an interactive digital environment.

Beyond basic anatomy learning, the project opens up a number of interesting extensions. The proportions of the model could be adjusted to explore variation between individuals, or to better represent different body types. In a biological or forensic anthropology context, the system could be used to annotate skeletal models with additional information—for example, marking sites of trauma or visualising patterns of injury. There is also potential to incorporate motion, such as illustrating the gait cycle during walking, linking structure to function.

With some traction gained on implementation, the project is likely to extend beyond a traditional desktop setting. The same ideas could be explored in more immersive environments, such as virtual or augmented reality, where users can interact with anatomical structures in a more embodied and spatially intuitive way.

The goal of this project is to explore how 3D visualisation and interaction can support learning in a domain where spatial understanding is essential, and to investigate how different forms of interaction influence that learning experience.

Potentially useful resources:

  • Z-Anatomy – open-source 3D anatomical models
  • Open3DModel – extended anatomy model project
  • Three.js – core JavaScript library for 3D graphics in the browser
  • GLTFLoader – loader for .glb / .gltf models

Thoughts on getting started:

  • Look to develop the project as a web app.
  • Experiment with a browser-based viewer for models. There are complete open-source apps that do this; however, much of the heavy lifting is done by general-purpose libraries, so it is not too difficult to roll your own (a starter sketch follows this list).
  • Invest time understanding how the different available models are structured.
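
For the second bullet, a roll-your-own viewer really can be short. The sketch below uses Three.js with GLTFLoader and OrbitControls (the libraries listed above); the model path is a placeholder.

```javascript
// Sketch: a minimal browser viewer for a skeleton model.
import * as THREE from 'three';
import { GLTFLoader } from 'three/addons/loaders/GLTFLoader.js';
import { OrbitControls } from 'three/addons/controls/OrbitControls.js';

const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(60, innerWidth / innerHeight, 0.1, 100);
camera.position.set(0, 1, 3);

const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(innerWidth, innerHeight);
document.body.appendChild(renderer.domElement);

scene.add(new THREE.AmbientLight(0xffffff, 1)); // flat lighting to start with
new OrbitControls(camera, renderer.domElement);  // rotate/zoom with the mouse

new GLTFLoader().load('models/skeleton.glb', (gltf) => scene.add(gltf.scene));

renderer.setAnimationLoop(() => renderer.render(scene, camera));
```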

Debate Me!

Project Manager: Josephine Kumar
Team Members: Cooper Ladd; Jack Smithson; Luka Jones; Kinnon Broekhuizen; and Kiday Ven Sun
Meeting Time: Mondays 2-3pm, K.B.07

Key Idea: Create an interactive AI-driven avatar of a political figure that responds to questions using publicly available material, combining text, voice, and visuals to explore how AI can represent real-world viewpoints.

  • Keywords: Web Scraping; Large Language Models (LLM) prompting; Speech and Visual Emulation; Web Technologies.

Public debate is a central part of how societies form opinions, challenge ideas, and make decisions. No more so than when a country goes to the polls in a General Election. Traditionally, engaging in debate requires access to other people, preparation time, and a willingness to argue a position in real time. This project explores how AI can lower these barriers by providing an interactive environment where users can debate with simulated participants.

The system would allow a user to select or define a persona to debate against. This could be a generalised viewpoint (for example, “a climate policy sceptic”) or a more specific, character-driven voice inspired by real-world figures. The focus is not on perfectly replicating individuals, but on capturing consistent styles of reasoning, tone, and argumentation.

At its core, the project investigates how large language models can be guided to produce structured, coherent arguments over multiple turns of interaction. Rather than generating isolated responses, the system should maintain context, track positions taken, and respond in a way that reflects an ongoing line of reasoning. This raises interesting challenges around memory, consistency, and the representation of “stance”.

A key strength of this project is that it naturally supports different technical roles within a team:

  • One strand of work can focus on language and argument modelling. This involves collecting examples of how a particular figure or viewpoint is expressed—such as public speeches, interviews, or social media posts—and using these to guide or condition a language model so that it produces responses in a consistent style and with a recognisable stance.
  • Another strand can focus on voice modelling. By collecting publicly available recordings of a person speaking, it is possible to explore tools and APIs that generate synthetic speech with similar vocal characteristics. This involves preprocessing audio data, experimenting with speech synthesis systems, and integrating the resulting audio into the application.
  • A third strand can explore visual representation through avatars. Given a set of images, students can investigate tools that generate a visual character and animate it in sync with spoken audio. This may involve working with browser-based 3D frameworks or specialised avatar-generation platforms.
  • A further strand can focus on the interaction and system design, including how debates are structured, how turns are managed, and how feedback is presented to the user. This includes designing interfaces that support different formats such as free-form discussion, timed exchanges, or structured argumentation.

This project intentionally explores a provocative space: how closely can a system emulate the voice, appearance, and rhetorical style of prominent public figures? The aim is not to deceive, but to better understand the capabilities and limitations of current AI technologies. This exploration should be carried out in a way that is respectful of individuals in the public eye and mindful of the broader implications.

One possible approach is to distinguish between controlled and public-facing outputs. A restricted demonstration environment could be used to explore more realistic emulations of well-known figures for a limited audience. In contrast, any publicly disseminated version of the system should adopt a more abstracted approach, where the personas represent aggregated viewpoints or generalised archetypes rather than identifiable individuals.

Given the ambitious scope of the project within a limited timeframe, it is important that the development plan includes staged milestones, with provision for some of the more advanced goals—such as fully realised real-time 3D avatars—not being fully achieved. An initial milestone might involve a text-based interaction, where the user enters a query and receives a response expressed in a style consistent with a chosen political viewpoint. A natural progression from this is to incorporate speech synthesis, allowing responses to be delivered in a voice that reflects the characteristics of the persona. Beyond this, the system could be extended to include visual elements, such as AI-generated images that place the persona in contextually relevant scenes aligned with the topic of discussion. These staged developments allow the project to deliver meaningful outcomes at each step, while still working towards a richer, multimodal interaction experience.
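
To make the first milestone concrete, here is a sketch of persona-conditioned, multi-turn text interaction using the OpenAI Node SDK as one representative option. The persona text and model name are illustrative; any comparable LLM API would do.

```javascript
// Sketch: the text-only first milestone. A persona is held in the system
// prompt, and the running debate history is passed back each turn so the
// model stays consistent with positions already taken.
import OpenAI from 'openai';

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const history = [{
  role: 'system',
  content: 'You are debating as a climate policy sceptic. Stay in character, ' +
           'keep replies under 150 words, and remain consistent with the ' +
           'positions you have already taken in this debate.',
}];

async function debateTurn(userArgument) {
  history.push({ role: 'user', content: userArgument });
  const response = await client.chat.completions.create({
    model: 'gpt-4o-mini', // illustrative model name
    messages: history,
  });
  const reply = response.choices[0].message.content;
  history.push({ role: 'assistant', content: reply });
  return reply;
}

console.log(await debateTurn('Carbon taxes are the cheapest way to cut emissions.'));
```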

Potentially Useful Links

Thoughts on getting started:

  • Experiment with separate Hello World style programs that demonstrate individual aspects of what this project needs—for example voice cloning, 2D/3D avatars, forming a text corpus—to establish what is possible.
  • Then plan how the different parts can be brought together to deliver the overall vision of the project.

The Mood of the Nation

Project Manager: Eli Murray
Team Members: Abigail Wong; Zaina Fathima; Enqi Huang; Aina Shehryar; and Pwint Sapal
Meeting Time: Wednesdays 1-2pm, TC.4.15

Key Idea: Analyse publicly available online content to model and visualise the “mood of the nation,” using interactive tools to explore trends and shifts in opinion over time.

  • Keywords: Natural Language Processing (NLP); Data Mining; Sentiment Analysis; Data Visualisation.

Public opinion is constantly being expressed through online platforms—news comments, social media posts, blogs, and forums. This project explores how these distributed signals can be collected and analysed to build a picture of how people feel about important topics, both at a point in time and as those views evolve.

The system would ingest text data from one or more sources and apply natural language processing techniques to extract indicators of sentiment, topics, and trends. Rather than focusing solely on whether opinions are positive or negative, the project encourages a richer interpretation: identifying themes, contrasting viewpoints, and how different issues rise and fall in prominence.

A key aspect of the project is how this information is presented. Raw analysis is unlikely to be meaningful on its own; instead, the system should provide clear and engaging visualisations. This could include dashboards showing how sentiment changes over time, comparisons between topics, or geographic breakdowns where data permits. The emphasis is on helping users explore and interpret the data, rather than simply displaying it.

This project naturally supports a range of technical roles within a team:

  • One strand can focus on data collection and ingestion, including accessing APIs, scraping publicly available data, and managing datasets. This includes dealing with issues such as rate limits, data cleaning, and storage.
  • Another strand can focus on natural language processing and analysis, applying techniques such as sentiment analysis, topic modelling, and keyword extraction. There is scope to experiment with different tools and compare how their outputs differ.
  • A third strand can focus on visualisation and interface design, developing dashboards or interactive views that allow users to explore the data. This may involve web-based visualisation libraries and careful consideration of how best to represent complex information.
  • A further strand can focus on system integration and architecture, ensuring that the different components—data ingestion, analysis, and presentation—work together in a coherent and responsive system.

Potentially Useful Links

Thoughts on getting started:

  • Bluesky looks to be a good choice for a social media platform with an API that allows you to access, monitor, and filter the full stream of messages that pass through it minute by minute (the "Firehose").
  • Example jetstream program that monitors for #nzpol tagged messages; a stripped-down sketch of the same idea follows this list.
  • Output of the example program running for around a week, in JSONL format.
  • I would also recommend you look at software frameworks—such as RSSHub—that allow you to take a webpage where someone posts regularly and turn it into a machine-readable feed that can be scanned over time for new content.
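
As a flavour of what the jetstream example involves, here is a stripped-down sketch that subscribes to post events and filters for #nzpol. The endpoint URL and event shape follow the Jetstream documentation at the time of writing, so verify them against the current docs; it runs in the browser, or in a recent Node.js release with a global WebSocket.

```javascript
// Sketch: subscribe to Bluesky's Jetstream and log #nzpol posts as JSONL.
const ws = new WebSocket(
  'wss://jetstream2.us-east.bsky.network/subscribe?wantedCollections=app.bsky.feed.post'
);

ws.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  const text = msg?.commit?.record?.text;
  if (text && text.toLowerCase().includes('#nzpol')) {
    // One JSON object per line -- the JSONL format noted above
    console.log(JSON.stringify({ time: msg.time_us, text }));
  }
};
```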

Pub Quiz Buster!

Project Manager: Annycah Libunao
Team Members: Ramla Omar; Alexandra Toal; Crystal Chooi; and Muska Zak
Meeting Time: Mondays 2-3pm, K.B.07

Key Idea: The range and depth of knowledge required to answer pub quiz questions is surprisingly broad. Do you know which Scrabble tiles score between 2 and 4 points, the population of Iceland, or who wrote Nothing Compares 2 U, made famous by Sinead O'Connor? I didn't (last night!) ... but I'd like to!! This project explores how software—through gamification and interaction—can make learning such an eclectic mix of facts engaging and memorable.

  • Keywords: Linked Data; Educational Gamification; Web Development.

The key to this project is devising a way to learn that isn't too monotonous or boring. (Solve that one in a general way and you'll have educationalists kissing your feet!) For example, knowing the populations of countries from around the world could be done by presenting lists, getting the user to memorise them, and then testing them on it. Boring! Instead, how about you present a challenge for a particular continent (say Africa), and then have jigsaw pieces of the various countries that have to be moved into the correct place, adding a graphic that shows how populous each country is. Variations and extensions abound: you could require the jigsaw to be solved by placing the pieces in order of population, largest to smallest; or the piece plays the national anthem when it is correctly put into place, and some additional facts about the country come up; or the jigsaw piece shows the flag of the country within it, but doesn't include the name, and you have to drag-and-drop the name of the country onto it as well. Then with a click of a button, you change some of the parameters you wish to learn about (continent, area of country rather than population, etc.) and go again.

So that sounds like a fun way to revise for Geography questions. Now to figure out some gamification for other categories such as Music, Sport, Art and Literature ...!

Useful Links:

Thoughts on getting started:

  • Get along to one of the Hamilton pub quiz nights! Believe it or Not is a syndicated operation, with several pubs in Hamilton running the quiz this company puts together each Tuesday night.
  • Spend time brainstorming around a broad set of gamification ideas: I suspect you will find you need different concepts for different types of rounds.
  • I would strongly encourage this to be a web-based project.
  • Plan your software architecture so the different activities can be brought together into one unified app. Inheritance could very well be the key here.

David's Take on Richard's House of Games

Key idea: develop a computerised version of the quiz show Richard's House of Games, featuring, importantly, the development of a software environment that assists in setting the sorts of questions the quiz show uses.

  • Keywords: Web Development, Video Streaming, Human Computer Interaction (HCI)

One way into the question-setting side of this project is to source the questions from Linked Open Data sources such as Wikidata, using Semantic Web technologies such as SPARQL to access and manipulate data/knowledge in a machine-readable way; a sketch of this follows. See Useful Links below.
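
To give a flavour of this, the sketch below queries the Wikidata Query Service for the ten most populous countries, the kind of fact a question-setting tool could build on. The property and class identifiers used (P31, Q6256, P1082) are standard Wikidata terms.

```javascript
// Sketch: fetch the ten most populous countries from the Wikidata Query
// Service. P31 = instance of, Q6256 = country, P1082 = population.
const query = `
  SELECT ?countryLabel ?population WHERE {
    ?country wdt:P31 wd:Q6256 ;
             wdt:P1082 ?population .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  }
  ORDER BY DESC(?population)
  LIMIT 10`;

const url = 'https://query.wikidata.org/sparql?format=json&query=' +
  encodeURIComponent(query);

fetch(url, { headers: { Accept: 'application/sparql-results+json' } })
  .then((response) => response.json())
  .then((data) => {
    for (const row of data.results.bindings) {
      console.log(row.countryLabel.value, row.population.value);
    }
  });
```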

There is also the development of the quiz-playing side to consider. I think it would be fun to recreate the look and feel of the show, including its graphics and signature music/audio.

As captured by the name of the TV show, Richard Osman is its host. A critical decision to make early on in this project is whether the computerised version retains this, or whether the gameplay can suffice without him.

A way the project can go beyond what gets compiled as a TV episode of the show is that the resources drawn upon to produce the questions can be targeted to the age range of the users who are going to compete. For example, well-known song selections from the 80s if the group playing is of my generation; more recent songs if closer to your age! To assist the software in short-listing songs that are well known, the songs being selected could be linked to data such as number of record sales, or chart position.

Useful link(s):

FaceTime ⮕ ♠AceTime♠

Project Manager: Shriha Deo
Team Members: Jackson Brough; Khalid Mohamed; Paalav Naicker; Jigar Solanki and Yu Liang Ang
Meeting Time: Wednesdays 2-3pm, TC.4.15

Key Idea: Extend video calling into a shared interactive space that allows people to stay connected through playing card games together at a distance.

  • Keywords: WebRTC; WebSockets; Client-Server Software Architecture; Web Development.

Video calling is good for conversation, but it does not always provide the same sense of shared presence as doing something together in the same room. This project explores how a FaceTime-like application could be extended to support shared activities, using card play as the motivating example. The underlying idea is to create a richer form of remote connection, where talking and interacting happen side by side.

At its simplest, the system would combine live video communication with a shared on-screen space that allows two people to play a card game together. The aim is not merely to reproduce an existing card game app, but to explore how such an activity can be integrated naturally into a video-calling environment. Questions of layout, turn-taking, privacy of each player’s hand, and the feel of manipulating cards all become important aspects of the design.

The project naturally supports a range of technical roles within a team:

  • One strand can focus on video communication and synchronisation, working with browser-based audio/video technologies and ensuring that the shared environment updates consistently for both participants.
  • Another strand can focus on shared game-state management, including how cards are dealt, moved, revealed, hidden, and synchronised across both ends of the call.
  • A third strand can focus on interface and interaction design, exploring how the shared card table should appear, how players interact with cards, and how the video and gameplay elements are balanced on screen.
  • A further strand can focus on game logic and extensibility, considering how the environment supports not just one hard-wired card game, but potentially a wider range of games.

A straightforward implementation path would be to begin with a single well-defined card game in a shared screen area. This would provide a solid base for exploring the core challenges of video integration, shared interaction, and game-state synchronisation. Beyond this, a more ambitious direction is to make the shared interaction area more general and adaptable.
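
As a sketch of the shared game-state strand, the snippet below shows a server-authoritative deal using Socket.IO: each player's hand is sent privately, while played cards are broadcast to everyone. Event names and the seven-card deal are illustrative choices, not a fixed design.

```javascript
// Sketch: server-authoritative shared card state with Socket.IO.
const { Server } = require('socket.io');
const io = new Server(3000, { cors: { origin: '*' } });

// Build and (crudely) shuffle a standard 52-card deck.
function buildShuffledDeck() {
  const suits = ['S', 'H', 'D', 'C'];
  const ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K'];
  return suits.flatMap((s) => ranks.map((r) => r + s))
              .sort(() => Math.random() - 0.5);
}

const deck = buildShuffledDeck();
const hands = {}; // socket.id -> cards only that player should see

io.on('connection', (socket) => {
  hands[socket.id] = deck.splice(0, 7);
  socket.emit('your-hand', hands[socket.id]); // private: just this client

  socket.on('play-card', (card) => {
    hands[socket.id] = hands[socket.id].filter((c) => c !== card);
    io.emit('card-played', { player: socket.id, card }); // public: everyone
  });
});
```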

The stretch target for the project is to investigate whether this shared space could be driven by LLM-generated code. In this version, rather than the application only supporting a fixed built-in game, a user might say “Let’s play Gin Rummy” or “Let’s play Hearts”, and the system would generate or adapt the interface and interaction logic needed for that game within the shared area of the call. This would shift the project from being a specialised card-game application towards a more general environment for shared remote play.

Given the ambition of this stretch goal, it is important that the project plan includes staged milestones. A first milestone might be a FaceTime-like environment with a shared table for a single card game. A next stage could add support for multiple predefined games. Only after this foundation is in place would it make sense to experiment with dynamically generated interfaces or rules driven by language models.

The goal of this project is to explore how video communication can be enriched through shared activity, and to investigate whether modern AI techniques can help make such environments more flexible, personal, and expressive.

Potentially useful links:

  • Building a Video Chat App with Node.js + Socket.io + WebRTC (Note this is only one of many you can find online)
  • I Built a Real-Time Chat App with WebRTC — No Server Required – an example that goes into more technical detail on the WebRTC side of things, although for this Smoke and Mirrors project you will make your life immensely easier by having a server!
  • WebRTC – The official word on real-time video and audio communication in the browser
  • Socket.IO – synchronising shared state between participants
  • PeerJS – simplified peer-to-peer communication
  • React – building interactive browser interfaces
  • OpenAI API – language models for generating rules or interface code
  • OpenAI API – for the dynamically configurable card-playing aspect of the project, the Model Context Protocol and an agentic software architecture are likely strong elements to build upon

Thoughts on getting started:

  • For the planning stage of this project, I would characterise the software to be developed as something that functions a lot like Zoom, where Share Screen is replaced by a more specialised feature that enables the interactive card playing.
  • Make the investment in learning a mobile-friendly stack such as React Native.
  • Build up a general set of resources for playing card games.

The 3D Museum Extension

Project Manager: Kai Meiklejohn
Team Members: Ivan Roberts; Kimera Talbot-Chinula; Risakee Gunasekera; and Khalill Marsh
Meeting Time: Wednesdays 2-3pm, TC.4.15

Key Idea: Extend a museum visit into the classroom by capturing a 360° digital version of the space, enabling students to revisit exhibits virtually and engage with linked content and learning activities.

  • Keywords: Multimedia Systems; Web Development; Digital Libraries; Human-Computer Interaction.

Imagine taking a class on a museum trip. As you walk through the space, you record the experience using a 360° camera—either as continuous video or as a sequence of panoramic images taken every few metres.

Now imagine returning to the classroom the next day.

Instead of handing out worksheets, you give the students access to a virtual version of the museum they just visited. They can look around, move through the space, and—crucially—they have to find things. When they locate a particular artefact, it becomes interactive. Selecting it opens up further information: text, images, perhaps video—drawn from an underlying digital collection.

And this is where your project begins.

Museum visits provide rich, engaging learning experiences, but they are often limited to a single moment in time. This project explores how a class visit to a museum can be extended into an ongoing, interactive learning resource that students can revisit and explore at their own pace.

The starting point for the project is the capture of the museum environment. Using a 360-degree camera, students can document the layout of the space and the exhibits within it. This captured material then forms the basis of a virtual environment that can be navigated back in the classroom.

Within this environment, students can explore the museum by looking around and moving between locations. A key feature is the ability to interact with exhibits: when a student identifies an object of interest, selecting it reveals additional information drawn from external sources, such as a digital library or curated online content. This creates a layered learning experience, where physical exploration is combined with deeper, information-driven inquiry.

A particularly interesting direction would be for this content to be drawn dynamically from a digital library system such as Greenstone. Rather than manually embedding all the supporting information into the virtual museum itself, the interactive hotspots could connect through to collection records, images, descriptive metadata, and related material that already exist in the digital library. This would make the system more realistic, more scalable, and more in keeping with how museum and archival collections are often managed in practice.

The project also opens up opportunities for structured learning activities. For example, students could be set tasks that require them to locate specific artefacts within the virtual space, answer questions based on the associated information, or compare items across different parts of the museum. In this way, the system supports both open-ended exploration and guided learning.

To hit the mark on this project, it is useful to think in terms of different strands of development within your team:

  • Capture and media processing: Working with 360-degree images or video, managing large media files, and preparing them for use in a web-based environment.
  • Navigation and interaction design: Developing the interface that allows users to move through the virtual space and interact with exhibits. This includes deciding how movement works and how interactive elements are presented.
  • Content integration: Linking points within the virtual environment to external resources such as digital library collections, webpages, or structured datasets. In a stronger version of the project, this could involve retrieving information dynamically from a backend system rather than relying on static content.
  • Learning activity design: Creating tasks, questions, and workflows that guide students through the environment and support educational outcomes.

A practical development approach would be to begin with a small number of 360-degree images connected in a simple navigation structure, with clickable hotspots that reveal additional information. From this foundation, the system can be extended with richer media, more sophisticated navigation, and more structured learning activities.
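
As a starting-point sketch for such a foundation, the snippet below maps one equirectangular image onto the inside of a Three.js sphere and raycasts mouse clicks against a single hotspot. The image path, hotspot position, and exhibit identifier are placeholders.

```javascript
// Sketch: one equirectangular photo on the inside of a sphere, with a
// single clickable hotspot.
import * as THREE from 'three';

const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(75, innerWidth / innerHeight, 0.1, 1100);
const renderer = new THREE.WebGLRenderer();
renderer.setSize(innerWidth, innerHeight);
document.body.appendChild(renderer.domElement);

// Invert the sphere so the panorama is seen from inside.
const panorama = new THREE.Mesh(
  new THREE.SphereGeometry(500, 60, 40).scale(-1, 1, 1),
  new THREE.MeshBasicMaterial({ map: new THREE.TextureLoader().load('museum-room1.jpg') })
);
scene.add(panorama);

// A hotspot marker the user can click to open exhibit details.
const hotspot = new THREE.Mesh(
  new THREE.SphereGeometry(5),
  new THREE.MeshBasicMaterial({ color: 0xffcc00 })
);
hotspot.position.set(100, 0, -200);
hotspot.userData.exhibitId = 'exhibit-042'; // key into the digital library
scene.add(hotspot);

// Raycast mouse clicks to detect hotspot selection.
const raycaster = new THREE.Raycaster();
addEventListener('click', (e) => {
  const mouse = new THREE.Vector2(
    (e.clientX / innerWidth) * 2 - 1,
    -(e.clientY / innerHeight) * 2 + 1
  );
  raycaster.setFromCamera(mouse, camera);
  if (raycaster.intersectObject(hotspot).length > 0) {
    console.log('open exhibit', hotspot.userData.exhibitId);
  }
});

renderer.setAnimationLoop(() => renderer.render(scene, camera));
```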

Note: An interesting algorithm that could play a role in this project is VSLAM. It was used to great effect in a recent research project in the department, and the accompanying report does a lot of the heavy lifting in terms of how it can be applied to an environment such as a museum. Come talk to me, and I'll dig out the report!

Potentially useful links:

Thoughts on getting started:

  • Spend some time learning about the Insta360 camera, and how as a user you can shoot footage, copy it off the camera, and then experience it as a 360-degree video.
  • While you might not end up using the SDK that the manufacturer produces to interface with the Insta360, it is likely worth spending some time learning about what features it provides.
  • Download, install and do some of the tutorial exercises for Greenstone3, to get a sense of the sorts of content it can store.
  • Make sure to spend time understanding how to control the "Document View" of Greenstone. Digital library software traditionally provides a passive (static) view of a document: here's the PDF—feel free to read it or download it; effectively over to you. However, it doesn't have to be that way. Within the digital library there is nothing stopping the displayed document from being a dynamically controlled activity. Connect hotspots in the 3D video (somehow!) to specific documents/activities in the digital library, and you have the core capability of the project.
  • In terms of museum-related content, unfortunately there is no instantly ready-for-purpose content that I know of. So an early stage of development for this project is to determine a practical strategy for how to go about this that is not a huge time-sink.

Smoke and Mirror Projects: From the Vaults

The Smoke and Mirrors brand has been a signature component of the Software Engineering programme at the University of Waikato from its inception. First run in 2003, it started life as a group-based project that 1st Year SE students, who had been learning to program in the C++ language, undertook. In 2010 it moved to the 2nd year level, with Java being the programming language taught, where it has remained since.
It is one of the great pleasures in my job at Waikato to be involved in providing the Smoke and Mirrors experience for our SE students, and for so many years—for all of the years the projects have run, in fact! There even came a point where I would be on sabbatical in the semester where Smoke and Mirrors was scheduled to run; however, a year in advance the department had changed the semester it ran in, so I could continue running the projects.
I haven't been able to locate any written record of the projects run in that first year, sadly. One from that year that does come to mind, however, was a team developing a software-based arbitrary-precision calculator. As part of their presentation to the department at the end of the semester, they demonstrated their GUI-based calculator calculating π to ... well ... a very high precision! For the years 2004–2009 I have been able to track down the titles of the projects that ran, which at least hints at the variety of projects undertaken. For more recent years, I still have the project briefs that the students start with when the projects are launched.
With a nod to posterity, here are the projects by year, working back from newest to oldest.