A multi-player tournament benchmark that tests LLMs on social reasoning, strategy, and deception. Players engage in public and private conversations, form alliances, and vote to eliminate each other.
This benchmark tests how well LLMs incorporate a set of 10 mandatory story elements (characters, objects, core concepts, attributes, motivations, etc.) in a short creative story.
Public Goods Game (PGG) Benchmark: Contribute & Punish is a multi-agent benchmark that tests cooperative and self-interested strategies among Large Language Models (LLMs) in a resource-sharing economic scenario. Our experiment extends the classic PGG with a punishment phase, allowing players to penalize free-riders or retaliate against others.
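To make the contribute-then-punish structure concrete, here is a minimal sketch of one round of a public goods game with a punishment phase. It is not taken from the benchmark's code: the function name `pgg_round` and the parameters `endowment`, `multiplier`, `punish_cost`, and `punish_fine` (with their default values) are illustrative assumptions, and the benchmark may use different rules and numbers.

```python
from typing import Dict, List, Tuple


def pgg_round(
    contributions: List[float],
    punishments: Dict[Tuple[int, int], float],
    endowment: float = 20.0,    # assumed starting tokens per player
    multiplier: float = 2.0,    # assumed growth factor applied to the pot
    punish_cost: float = 1.0,   # assumed cost to the punisher per point
    punish_fine: float = 3.0,   # assumed loss to the target per point
) -> List[float]:
    """Return each player's payoff after one contribute-and-punish round."""
    n = len(contributions)
    # Contribution phase: the pot is multiplied and split equally,
    # regardless of how much each player put in.
    share = multiplier * sum(contributions) / n
    payoffs = [endowment - c + share for c in contributions]
    # Punishment phase: keys are (punisher, target) index pairs.
    for (punisher, target), points in punishments.items():
        payoffs[punisher] -= punish_cost * points
        payoffs[target] -= punish_fine * points
    return payoffs


# Player 2 free-rides; player 0 spends 2 points punishing them.
payoffs = pgg_round([20, 20, 0, 10], {(0, 2): 2})
print(payoffs)
```

Under these assumed parameters the free-rider still earns the most in the round, which shows why punishment is costly to enforce: the punisher gives up part of their own payoff to reduce the target's.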
This is a limited vector database and RAG system developed with Python, made for R users, designed to generate bleeding-edge responses from challenging LLM prompts using your local R technical documentation.
Integrates OVMS V3 into Home Assistant through HACS. Communicates via MQTT, automatically creating and parsing vehicle entities. Developed with AI assistance.
Blog project that heavily uses ChatGPT and Cursor. As a back-end enthusiast who’s no fan of CSS, I relied on these AI tools to help shape both the design and content. From styling advice to layout tweaks, the entire process was an AI-driven collaboration—and it turned out surprisingly well!
Claude 3.7 Swarm with Field Coherence: A Model Context Protocol (MCP) server that orchestrates multiple specialized Claude 3.7 Sonnet instances in a quantum-inspired swarm. It creates a field coherence effect across pattern recognition, information theory, and reasoning specialists to produce optimally coherent responses from ensemble intelligence.
Gets a comprehensive list of disk devices on the system with their mounting information. Useful for determining which "drives" are actually local, physical devices. Easier to use than various Microsoft "solutions".
A Calendly-like application for scheduling interviews with Google Calendar and Zoom integration. Manage your availability, create automated Zoom meetings, and share a booking link with interviewees. Built with Next.js, React, TypeScript, and Tailwind CSS.