Posts by Collection

projects

getPaid

Published: November 01, 2022

Abstract:

Created a SparkML RandomForest model to predict total employee compensation. Queried data with SparkSQL and used PySpark scripts to run EDA, pre-process the data, and train the model, achieving an R2 score of 0.98.
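For readers unfamiliar with the metric, the R2 score (coefficient of determination) can be sketched in a few lines of plain Python. This is only an illustration of the formula, not the project's Spark evaluation code; the function name and the sample figures are made up.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


# Made-up compensation figures, for illustration only.
actual = [50_000.0, 60_000.0, 70_000.0, 80_000.0]
predicted = [51_000.0, 59_000.0, 71_000.0, 79_000.0]
score = r2_score(actual, predicted)
```

A score close to 1 means the predictions explain nearly all of the variance in the target.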

Citation:

Cancer Detection as a Microservice

Published: December 01, 2022

Abstract:

Trained a Voting Classifier to predict cancer type with the Wisconsin Breast Cancer Dataset. Hosted the model as a Flask-based microservice running within a Docker container.

Citation:

Product Shipping Status Classifier

Published: December 01, 2022

Abstract:

Created a RandomForest model to predict supply chain issues that might delay product shipment to customers. Employed PCA and LDA for visualization and dimensionality reduction. The RandomForest model achieved an accuracy of 97% (with PCA) and 92% (with LDA).

Citation:

Interpretable Parking Ticket Location Classifier

Published: January 01, 2023

Abstract:

Built a SparkML RandomForest classifier to predict the police precinct of ticketed vehicles using the NYC parking-ticket dataset. Used PySpark for parallel computation. Achieved 98% accuracy and identified police precincts with anomalous ticketing practices.

Citation:

AI Alignment in Criminal Justice - Fair Jury Selection and Deliberation

Published: December 04, 2024

Abstract:

Created a theoretical hybrid model for fair jury selection and intra-jury deliberation using optimization and game theory, with a novel extension to the Nash equilibrium.

Citation:

LLM-as-a-SupremeCourt-Judge

Published: December 08, 2024

Abstract:

Developed an AI framework with LLMs (GPT API, fine-tuning, chain-of-thought reasoning) to evaluate 200+ Supreme Court cases, revealing how factors such as panel size, prompting strategies, and ideology personas influence alignment between LLM reasoning and human judges.

Citation:

Detecting Deception: Intelligent Systems for Fighting Misinformation

Published: May 01, 2025

Abstract:

Designed a logic-gated fact-checking system integrating RoBERTa classifiers with contradiction detection, validated on 7,200+ claims. Improved explainability and reliability by maintaining 83% accuracy while enabling selective overrides for low-confidence predictions.
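The gating idea can be sketched roughly as follows. The abstract does not spell out the actual gate, so this is a minimal sketch under stated assumptions: the threshold value, the label strings, and the names `Verdict` and `gated_verdict` are all hypothetical, and the classifier and contradiction detector are represented only by their outputs.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    label: str   # final label for the claim
    source: str  # which branch of the gate produced it


def gated_verdict(predicted_label, confidence, contradiction_found, threshold=0.7):
    """Combine a classifier prediction with a contradiction signal.

    Assumption: the override is applied only when the classifier is unsure.
    """
    # Confident prediction: keep the classifier's label untouched.
    if confidence >= threshold:
        return Verdict(predicted_label, "classifier")
    # Low confidence plus a detected contradiction: override to "false".
    if contradiction_found:
        return Verdict("false", "contradiction_override")
    # Low confidence, no contradiction: keep the label but flag it.
    return Verdict(predicted_label, "low_confidence")


confident = gated_verdict("true", 0.92, contradiction_found=True)
overridden = gated_verdict("true", 0.41, contradiction_found=True)
flagged = gated_verdict("true", 0.41, contradiction_found=False)
```

Recording the `source` of each verdict is one way such a gate could support explainability, since every final label can be traced back to the rule that produced it.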

Citation:

Beyond Hallucinations: A Multi-Agent Approach To Fact-Checked AI Knowledge

Published: May 05, 2025

Abstract:

Developed a theoretical framework consisting of multiple LLM agents producing verifiable responses through debate and consensus.
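A debate-and-consensus loop of this kind might look like the sketch below. It is an illustration only, not the framework itself: the agents are stub functions standing in for LLM calls, and the names `debate`, `stubborn`, and `follower`, along with the `quorum` and `max_rounds` parameters, are hypothetical.

```python
from collections import Counter


def debate(agents, question, max_rounds=3, quorum=1.0):
    """Return the answer once a `quorum` fraction of agents agree, else None."""
    # Opening round: every agent answers without seeing any peers.
    answers = [agent(question, []) for agent in agents]
    for _ in range(max_rounds):
        top, count = Counter(answers).most_common(1)[0]
        if count / len(answers) >= quorum:
            return top
        # Debate round: each agent revises after seeing the others' answers.
        answers = [
            agent(question, answers[:i] + answers[i + 1:])
            for i, agent in enumerate(agents)
        ]
    top, count = Counter(answers).most_common(1)[0]
    return top if count / len(answers) >= quorum else None


def stubborn(answer):
    """Stub agent that never changes its answer."""
    return lambda question, peers: answer


def follower(initial):
    """Stub agent that adopts the most common peer answer once it sees one."""
    def agent(question, peers):
        if not peers:
            return initial
        return Counter(peers).most_common(1)[0][0]
    return agent


agreed = debate([stubborn("Paris"), stubborn("Paris"), follower("Lyon")],
                "What is the capital of France?")
deadlock = debate([stubborn("A"), stubborn("B")], "Pick a letter")
```

Returning `None` when no quorum is reached lets the caller decline to answer instead of emitting an unverified response.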

Citation:

The Sound of Suffrage - Modeling Gender and Power in Parliamentary Speech

Published: August 01, 2025

Abstract:

Developing an NLP pipeline over 200 years of Hansard debates (1803–2005), using LLMs to study gender bias, framing, and speaker dynamics. Applying representation analysis and framing detection to uncover temporal shifts in discourse and generate insights on bias and fairness in AI systems.

Citation:

Mandira Sawkar
