ARES — Autonomous Research & Evolution System

What are Experiments? ARES reads research papers from arXiv and similar sources, then tries to reproduce or verify the core technique described in each paper by writing and running real code. Each row below is one of those attempts. A Success means ARES ran the code and it produced meaningful results. A Failed status means the code ran but hit an error — ARES can retry those. Click View on any row to see the full output, logs, and generated code.

Experiment Summary

Total Experiments

9578

All papers ARES has attempted to reproduce.

Successful

9459

Code ran and produced valid results.

Failed

Errors during execution — eligible for retry.

Pending / Running

Queued or currently in progress.

System Mode

IDLE

Is ARES actively running experiments?

Worker

Python skill training cycle

Background process status.

All Experiments

Each row is one paper ARES has tried to reproduce. Click View to see the generated code, results, and logs.

Experiment / Paper	Topic / Summary	Created	Status	Error	Actions
exp_pytrain.20260522223131.031_20260522_223248 Paper: pytrain.20260522223131.031	Python Skill Fallback Title: Building a Type-Aware Package with Data Validation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 22:33	Success	-	View
exp_pytrain.20260522213303.030_20260522_213436 Paper: pytrain.20260522213303.030	Python Skill Fallback Title: Creating a Type-safe Asyncio Service - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 21:35	Success	-	View
exp_pytrain.20260522203336.029_20260522_203519 Paper: pytrain.20260522203336.029	Python Skill Fallback Title: Creating a Type-Annotated and Packaged Python Application - Focus: Python Standard Library, Type Annotations, Packaging with setuptools - Note: Generated fallback due to unavailable model output.	05-22 20:36	Success	-	View
exp_pytrain.20260522193012.028_20260522_193159 Paper: pytrain.20260522193012.028	Python Skill Fallback Title: Creating a Python Package with Typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 19:33	Success	-	View
exp_pytrain.20260522182955.027_20260522_183110 Paper: pytrain.20260522182955.027	Python Skill Fallback Title: Creating a Type-Safe and Packaged Python Tool - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 18:32	Success	-	View
exp_pytrain.20260522172748.026_20260522_172906 Paper: pytrain.20260522172748.026	Python Skill Fallback Title: Packaging a Python Type-Hinted Library - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 17:30	Success	-	View
exp_pytrain.20260522162527.025_20260522_162732 Paper: pytrain.20260522162527.025	This Python code is a simplified validation tool targeting Python package files for type annotations and setup configura... README.md Description: The `benchmark.py` script provides a basic benchmark test for a Python project validation tool against predefined rules focusing on typing correctness (PEP 484) and following the package specifications guidelines (PEP...	05-22 16:28	Success	-	View
exp_pytrain.20260522152337.024_20260522_152502 Paper: pytrain.20260522152337.024	Python Skill Fallback Title: Type-Safe Python Package Manager - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 15:26	Success	-	View
exp_pytrain.20260522142445.023_20260522_142646 Paper: pytrain.20260522142445.023	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 14:27	Success	-	View
exp_pytrain.20260522132826.022_20260522_132936 Paper: pytrain.20260522132826.022	This is a benchmark for running a synthetic workload involving type annotations as per PEP 695 in a package structure. T... To run the benchmark: 1. Ensure Python version >=3.9.0, which supports PEP 585. 2. Execute `python benchmark.py` from this directory. Expected Output: The script concludes with either VERIFIED: indicating a successful verification or **...	05-22 13:30	Success	-	View
exp_pytrain.20260522122304.021_20260522_122418 Paper: pytrain.20260522122304.021	Use type hints to create a utility function that calculates memory usage based on data size parameters. Ensure the imple... Benchmark the runtime performance and report results clearly as required including a PASS/FAIL statement. The self-checks should include various edge cases like null input types and boundary values ensuring reliability. ```python import sys...	05-22 12:25	Success	-	View
exp_pytrain.20260522112344.020_20260522_112512 Paper: pytrain.20260522112344.020	Python Skill Fallback Title: Package FlashAttention with Typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 11:26	Success	-	View
exp_pytrain.20260522102248.019_20260522_102412 Paper: pytrain.20260522102248.019	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 10:25	Success	-	View
exp_pytrain.20260522092345.018_20260522_092522 Paper: pytrain.20260522092345.018	Python Skill Fallback Title: Building a Type-Checked Logging Package - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 09:26	Success	-	View
exp_pytrain.20260522082116.017_20260522_082236 Paper: pytrain.20260522082116.017	Python Skill Fallback Title: Automated Python Package Version Checker - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 08:23	Success	-	View
exp_pytrain.20260522072222.016_20260522_072346 Paper: pytrain.20260522072222.016	Python Skill Fallback Title: Create a Robust Package Manager - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 07:24	Success	-	View
exp_pytrain.20260522062612.015_20260522_062749 Paper: pytrain.20260522062612.015	Python Skill Fallback Title: Enhance Functionality with Type Hinting and Packaging - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 06:28	Success	-	View
exp_pytrain.20260522052423.014_20260522_052554 Paper: pytrain.20260522052423.014	Python Skill Fallback Title: Type-safe Python Packaging - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 05:26	Success	-	View
exp_pytrain.20260522042548.013_20260522_042655 Paper: pytrain.20260522042548.013	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 04:27	Success	-	View
exp_pytrain.20260522033035.012_20260522_033208 Paper: pytrain.20260522033035.012	Introduction The 'simple_calculator' Python package is designed to perform basic mathematical operations such as addition, subtraction, multiplication, and division with robust type annotations for enhanced maintainability and testability. This reposito...	05-22 03:33	Success	-	View
exp_pytrain.20260522023152.011_20260522_023333 Paper: pytrain.20260522023152.011	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 02:34	Success	-	View
exp_pytrain.20260522013350.010_20260522_013454 Paper: pytrain.20260522013350.010	Python Skill Fallback Title: Asynchronous Function with Type Hints - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 01:35	Success	-	View
exp_pytrain.20260522003827.009_20260522_003957 Paper: pytrain.20260522003827.009	Python Skill Fallback Title: Build and Test an Autodoc Module with Type Hints - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-22 00:40	Success	-	View
exp_pytrain.20260521233708.008_20260521_233844 Paper: pytrain.20260521233708.008	Python Skill Fallback Title: Creating a Typable Python Library with a Setup Script - Focus: typing, package_management - Note: Generated fallback due to unavailable model output.	05-21 23:39	Success	-	View
exp_pytrain.20260521223747.007_20260521_223859 Paper: pytrain.20260521223747.007	Type-Safe Tensor Operations and Package Distribution Introduction: The 'tensor_ops' Python package provides a type-safe interface for basic tensor operations using PyTorch, aimed at improving maintainability, testability, and robustness. This package includes unit tests and documentation, ens...	05-21 22:40	Success	-	View
exp_pytrain.20260521213524.006_20260521_213704 Paper: pytrain.20260521213524.006	Python Skill Fallback Title: Creating a Python Package with Type Annotations - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-21 21:38	Success	-	View
exp_pytrain.20260521203204.005_20260521_203354 Paper: pytrain.20260521203204.005	Python Skill Fallback Title: Type Hinted Package Manager - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-21 20:34	Success	-	View
exp_pytrain.20260521193422.004_20260521_193627 Paper: pytrain.20260521193422.004	Python Skill Fallback Title: Building a Type-Safe Package Manager - Focus: {'name': 'Type Hints', 'details': ['Use, {'name': 'Packaging Standards', 'details - Note: Generated fallback due to unavailable model output.	05-21 19:37	Success	-	View
exp_pytrain.20260521183552.003_20260521_183731 Paper: pytrain.20260521183552.003	Python Skill Fallback Title: Build an Asynchronous Python Package with Type Annotations - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-21 18:38	Success	-	View
exp_pytrain.20260521173557.002_20260521_173707 Paper: pytrain.20260521173557.002	Building a Type-Aware Package with Packaging Utilities Introduction: This exercise involves creating a Python package that utilizes type hints as per PEP 695 standards and modern packaging techniques such as poetry. The primary goal is to ensure robustness and maintainability through static typ...	05-21 17:38	Success	-	View
exp_pytrain.20260521163544.001_20260521_163653 Paper: pytrain.20260521163544.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-21 16:37	Success	-	View
exp_pytrain.20260521155830.001_20260521_160020 Paper: pytrain.20260521155830.001	The objective is to design a Python module that leverages advanced features provided by the `typing` library, including... Readme The objective is to design a Python module that leverages advanced features provided by the `typing` library, including generics and callable objects. The module shall be packaged into a standalone package ensuring all modules functi...	05-21 16:00	Pending	-	View
exp_pytrain.20260521145507.050_20260521_145625 Paper: pytrain.20260521145507.050	Asynchronous Task Executor with Type Annotations This Python coding drill benchmarks the performance of an asynchronous task executor that uses type hints for improved readability, maintainability, and code robustness. Problem Description Create a module `async_task_manager.py` which shou...	05-21 14:57	Success	-	View
exp_pytrain.20260521134739.049_20260521_134904 Paper: pytrain.20260521134739.049	Python Skill Fallback Title: Create Typing-Aware Package with PyTorch - Focus: Python typing module and mypy static typ, Integration of third-party stub files fo - Note: Generated fallback due to unavailable model output.	05-21 13:50	Success	-	View
exp_pytrain.20260521124119.048_20260521_124243 Paper: pytrain.20260521124119.048	The `math_operations` Python package is designed to provide a robust framework for performing basic and advanced mathema... Installation You can install the package using pip: Requirements - Python >= 3.6 for proper type hinting support. - Familiarity with PEP 484 and dynamic typing in Python. Contributing Guidelines Contributions are welcome! Please ensure all...	05-21 12:43	Success	-	View
exp_pytrain.20260521113640.047_20260521_113839 Paper: pytrain.20260521113640.047	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-21 11:39	Success	-	View
exp_pytrain.20260521103200.046_20260521_103345 Paper: pytrain.20260521103200.046	Python Skill Fallback Title: Create a Python Package with Type Checking - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-21 10:34	Success	-	View
exp_pytrain.20260521091947.045_20260521_092105 Paper: pytrain.20260521091947.045	Python Skill Fallback Title: Creating a Python Package with Type Hints - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-21 09:22	Success	-	View
exp_pytrain.20260521081346.044_20260521_081525 Paper: pytrain.20260521081346.044	Python Skill Fallback Title: Type-Safe Module Loader - Focus: {'topic_name': 'Type Hinting', 'descript, {'topic_name': 'Packaging', 'description - Note: Generated fallback due to unavailable model output.	05-21 08:16	Success	-	View
exp_pytrain.20260521071110.043_20260521_071243 Paper: pytrain.20260521071110.043	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-21 07:13	Success	-	View
exp_pytrain.20260521060645.042_20260521_060823 Paper: pytrain.20260521060645.042	Python Skill Fallback Title: Type Parameter Syntax in Package - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-21 06:09	Success	-	View
exp_pytrain.20260521050130.041_20260521_050322 Paper: pytrain.20260521050130.041	Python Skill Fallback Title: Type-Aware Package Distribution - Focus: typing.NewType for creating distinct typ, dataclasses, namedtuples, or simple clas, PEP 484 guidelines for type hinting synt, using typing.FileIO and other I/O types, constructing a setup.py t...	05-21 05:04	Success	-	View
exp_pytrain.20260521040141.040_20260521_040327 Paper: pytrain.20260521040141.040	Type-Aware Packaging for Python Scripts Problem Statement: Using type hints and proper packaging can significantly enhance the maintainability, readability, and testability of a Python project. The objective is to design a small utility script inspired by FlashAttention that inco...	05-21 04:04	Success	-	View
exp_pytrain.20260521025909.039_20260521_030110 Paper: pytrain.20260521025909.039	Packaging a Python Project with Type Annotations Goal: Create a complete Python project that includes setup for packaging, type annotations using mypy types, and ensures all modules are testable via pytest. Requirements: - `pytest` for testing. - Installed Python >= 3.7 (to support ty...	05-21 03:02	Success	-	View
exp_pytrain.20260521015653.038_20260521_015817 Paper: pytrain.20260521015653.038	Python Skill Fallback Title: Type Annotated Package Manager - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-21 01:59	Success	-	View
exp_pytrain.20260521005235.037_20260521_005349 Paper: pytrain.20260521005235.037	Python Skill Fallback Title: Create a Python Package with Type Hints - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-21 00:54	Success	-	View
exp_pytrain.20260520235029.036_20260520_235148 Paper: pytrain.20260520235029.036	Python Skill Fallback Title: Python Package with Type Annotations - Focus: Python typing library (PEP 483/484), packaging a Python module with setup too - Note: Generated fallback due to unavailable model output.	05-20 23:52	Success	-	View
exp_pytrain.20260520224840.035_20260520_224955 Paper: pytrain.20260520224840.035	Building a Robust Typing and Packaging System for a Python Module Objective: Write robust, reusable Python code that includes comprehensive type annotations following the PEP 484 guidelines. Ensure that the module is properly organized and packaged using `python setup.py` or similar packaging tools. Metri...	05-20 22:50	Success	-	View
exp_pytrain.20260520214610.034_20260520_214746 Paper: pytrain.20260520214610.034	Python Skill Fallback Title: Build a Robust Python Project - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-20 21:48	Success	-	View
exp_pytrain.20260520204029.033_20260520_204225 Paper: pytrain.20260520204029.033	Python Skill Fallback Title: Module Packaging and Type Checking - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-20 20:43	Success	-	View
exp_pytrain.20260520193832.032_20260520_193946 Paper: pytrain.20260520193832.032	Python Skill Fallback Title: Creating a Python Package with Type Annotations - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-20 19:40	Success	-	View
exp_pytrain.20260520183336.031_20260520_183556 Paper: pytrain.20260520183336.031	Python Skill Fallback Title: Creating a Type-Safe CLI Tool - Focus: {'topic': 'typing', 'description': "Use, {'topic': 'packaging', 'description': "B - Note: Generated fallback due to unavailable model output.	05-20 18:36	Success	-	View
exp_pytrain.20260520172436.030_20260520_172657 Paper: pytrain.20260520172436.030	Python Skill Fallback Title: Creating a Type-safe, Asynchronous Task Manager - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-20 17:27	Success	-	View
exp_pytrain.20260520162348.029_20260520_162441 Paper: pytrain.20260520162348.029	Python Skill Fallback Title: Creating a Type-Safe Packaging Utility - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-20 16:25	Success	-	View
exp_pytrain.20260520152324.028_20260520_152501 Paper: pytrain.20260520152324.028	Python Skill Fallback Title: Packaging a Typing-Friendly Python App - Focus: {'topic': 'typing', 'resources': ['https, {'topic': 'packaging', 'resources': ['ht - Note: Generated fallback due to unavailable model output.	05-20 15:26	Success	-	View
exp_pytrain.20260520141842.027_20260520_142015 Paper: pytrain.20260520141842.027	Python Skill Fallback Title: Package a Python Project with Type Annotations - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-20 14:21	Success	-	View
exp_pytrain.20260520131642.026_20260520_131804 Paper: pytrain.20260520131642.026	Python Skill Fallback Title: Creating a Python Package with Typed Data Classes - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-20 13:19	Success	-	View
exp_pytrain.20260520120941.025_20260520_121202 Paper: pytrain.20260520120941.025	Python Skill Fallback Title: Python Package Enhancer - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-20 12:13	Success	-	View
exp_pytrain.20260520110253.024_20260520_110431 Paper: pytrain.20260520110253.024	Python Skill Fallback Title: Building a Basic Python Package with Type Annotations - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-20 11:05	Success	-	View
exp_pytrain.20260520100410.023_20260520_100539 Paper: pytrain.20260520100410.023	Python Skill Fallback Title: Packaging Asynchronous Python Application - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-20 10:06	Success	-	View
exp_pytrain.20260520090106.022_20260520_090222 Paper: pytrain.20260520090106.022	Python Skill Fallback Title: Type Annotations for Package Initialization - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-20 09:03	Success	-	View
exp_pytrain.20260520075854.021_20260520_080009 Paper: pytrain.20260520075854.021	Python Skill Fallback Title: Creating a Typing-Aware Package - Focus: Python stdlib.typing, Pep484 - Type Hints, Python Packaging User Guide - Note: Generated fallback due to unavailable model output.	05-20 08:01	Success	-	View
exp_pytrain.20260520065312.020_20260520_065440 Paper: pytrain.20260520065312.020	Python Skill Fallback Title: Create a Python Package for Robust Numerical Computation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-20 06:55	Success	-	View
exp_pytrain.20260520054850.019_20260520_055008 Paper: pytrain.20260520054850.019	Python Skill Fallback Title: Develop a Python Package with Type Annotations and Packaging Standards - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-20 05:51	Success	-	View
exp_pytrain.20260520044911.018_20260520_045049 Paper: pytrain.20260520044911.018	This drill focuses on implementing a utility that heavily leverages Python's type system. It emphasizes reliability thro... Performance benchmarking involves measuring execution speed and memory usage while ensuring the code operates correctly even with unconventional or extreme inputs. README.md Python Reliability Drill: Typing Implemented a type-safe Python ut...	05-20 04:51	Success	-	View
exp_pytrain.20260520034229.017_20260520_034428 Paper: pytrain.20260520034229.017	Python Skill Fallback Title: Type-annotated Python Package for Handling Files - Focus: Python stdlib, typing, packaging - Note: Generated fallback due to unavailable model output.	05-20 03:45	Success	-	View
exp_pytrain.20260520023518.016_20260520_023706 Paper: pytrain.20260520023518.016	Python Skill Fallback Title: Creating a Robust Configuration Handler - Focus: {'description': "Use Python's typing fea, {'description': 'Learn how to properly p - Note: Generated fallback due to unavailable model output.	05-20 02:38	Success	-	View
exp_pytrain.20260520013200.015_20260520_013330 Paper: pytrain.20260520013200.015	Python Skill Fallback Title: Construct a Type-Full CLI Tool - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-20 01:34	Success	-	View
exp_pytrain.20260520002712.014_20260520_002840 Paper: pytrain.20260520002712.014	Python Skill Fallback Title: Creating a Python Package for Type Checking - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-20 00:29	Success	-	View
exp_pytrain.20260519232636.013_20260519_232747 Paper: pytrain.20260519232636.013	Python Skill Fallback Title: Build and Test a Python Package - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-19 23:28	Success	-	View
exp_pytrain.20260519221956.012_20260519_222159 Paper: pytrain.20260519221956.012	This Python coding drill benchmark aims to develop a type-safe package for text analysis functionalities such as tokeniz... Setup Instructions Before you start: 1. Clone the repository or download it. 2. Make sure Python 3.x is installed on your system. 3. The benchmark does not require any external dependencies beyond Python's standard library. Goal Create a ru...	05-19 22:23	Success	-	View
exp_pytrain.20260519211602.011_20260519_211808 Paper: pytrain.20260519211602.011	Python Skill Fallback Title: Python Module Packaging with Type Annotations - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-19 21:19	Success	-	View
exp_pytrain.20260519201018.010_20260519_201134 Paper: pytrain.20260519201018.010	Python Skill Fallback Title: Creating an Asynchronous Package for Logging - Focus: {'topic': 'typing', 'description': "Use, {'topic': 'packaging', 'description': 'S - Note: Generated fallback due to unavailable model output.	05-19 20:12	Success	-	View
exp_pytrain.20260519190707.009_20260519_190827 Paper: pytrain.20260519190707.009	Python Skill Fallback Title: Creating a Robust Typing Package - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-19 19:09	Success	-	View
exp_pytrain.20260519175827.008_20260519_180029 Paper: pytrain.20260519175827.008	Python Skill Fallback Title: Creating a Python Package with Advanced Typings - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-19 18:01	Success	-	View
exp_pytrain.20260519165353.007_20260519_165519 Paper: pytrain.20260519165353.007	This benchmark is a Python coding drill that assesses reliable and robust utility implementation focusing on typing feat... To execute this benchmark, follow these steps: 1. Ensure your environment meets Python's standard library requirements. 2. Clone or download the script `benchmark.py`. 3. Run the benchmark by executing `python benchmark.py` in your terminal...	05-19 16:56	Success	-	View
exp_pytrain.20260519155303.006_20260519_155420 Paper: pytrain.20260519155303.006	Python Skill Fallback Title: Creating a Python Package with Typed Function Definitions - Focus: type hinting, module design, unit testing with hypothesis or pytest, creating packaging for Python scripts - Note: Generated fallback due to unavailable model output.	05-19 15:55	Success	-	View
exp_pytrain.20260519145335.005_20260519_145459 Paper: pytrain.20260519145335.005	Python Skill Fallback Title: Creating a Robust Python Package with Type Annotations - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-19 14:56	Success	-	View
exp_pytrain.20260519135420.004_20260519_135654 Paper: pytrain.20260519135420.004	Python Skill Fallback Title: Type-annotated CLI Tool - Focus: {'topic': 'type hinting', 'details': 'An, {'topic': 'argparse', 'details': 'Use ar, {'topic': 'setuptools', 'details': 'Pack - Note: Generated fallback due to unavailable model output.	05-19 13:57	Success	-	View
exp_pytrain.20260519125436.003_20260519_125608 Paper: pytrain.20260519125436.003	Python Skill Fallback Title: Building a Type-Safe and Packagable Async Scraper - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-19 12:57	Success	-	View
exp_pytrain.20260519115336.002_20260519_115509 Paper: pytrain.20260519115336.002	Python Skill Fallback Title: Generic Function with Constraint - Focus: type parameter syntax, parameter constraints, package creation - Note: Generated fallback due to unavailable model output.	05-19 11:56	Success	-	View
exp_pytrain.20260519105116.001_20260519_105257 Paper: pytrain.20260519105116.001	Python Skill Fallback Title: Creating a Robust CLI Tool with Typing and Packaging - Focus: typing.Type, packaging.setup - Note: Generated fallback due to unavailable model output.	05-19 10:53	Success	-	View
exp_pytrain.20260519085632.001_20260519_085836 Paper: pytrain.20260519085632.001	Python Skill Fallback Title: Building a Typing Compliant Python Package - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-19 08:59	Success	-	View
exp_pytrain.20260519071652.016_20260519_071816 Paper: pytrain.20260519071652.016	Python Skill Fallback Title: Creating a Robust Library with Type Hints - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-19 07:19	Success	-	View
exp_pytrain.20260519061847.015_20260519_062010 Paper: pytrain.20260519061847.015	Python Skill Fallback Title: Build a Typing and Packaging Benchmark for Python - Focus: PEP 484 (Type Hints), PEP 695 (Type Parameter Syntax), Python Packaging, Mypy Linting Tool - Note: Generated fallback due to unavailable model output.	05-19 06:21	Success	-	View
exp_pytrain.20260519051740.014_20260519_051906 Paper: pytrain.20260519051740.014	Python Skill Fallback Title: Creating a Robust Python Library - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-19 05:20	Success	-	View
exp_pytrain.20260519041657.013_20260519_041803 Paper: pytrain.20260519041657.013	Python Skill Fallback Title: Creating a Robust Python Package for FlashAttention Implementation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-19 04:19	Success	-	View
exp_pytrain.20260519031621.012_20260519_031841 Paper: pytrain.20260519031621.012	Python Skill Fallback Title: Python Package with Type Annotations - Focus: Python Standard Library, Package Management with pip/setuptools/d, Type Annotations in Python, Static Type Checking with mypy - Note: Generated fallback due to unavailable model output.	05-19 03:19	Success	-	View
exp_pytrain.20260519020958.011_20260519_021200 Paper: pytrain.20260519020958.011	This directory contains a Python CLI application named `notes_app.py` that helps manage notes stored in JSON files. The... Features include: - Adding notes with title and content. - Listing all notes. - Deleting a specified note. Ensure you run `./notes_app.py --help` for details on each command usage. This application is designed to be compliant with the provi...	05-19 02:13	Success	-	View
exp_pytrain.20260519010522.010_20260519_010701 Paper: pytrain.20260519010522.010	Python Skill Fallback Title: Asynchronous Webhook Handler with Type Annotations - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-19 01:08	Success	-	View
exp_pytrain.20260519000214.009_20260519_000353 Paper: pytrain.20260519000214.009	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-19 00:04	Success	-	View
exp_pytrain.20260518230421.008_20260518_230525 Paper: pytrain.20260518230421.008	Python Skill Fallback Title: Creating a Robust Python Package with Type Annotations - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 23:06	Success	-	View
exp_pytrain.20260518215627.007_20260518_215828 Paper: pytrain.20260518215627.007	Python Skill Fallback Title: Type-Driven Development and Packaging for a Calculator Application - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 21:59	Success	-	View
exp_pytrain.20260518205357.006_20260518_205610 Paper: pytrain.20260518205357.006	Experiment Benchmark This experiment contains a runnable benchmark generated by ARES. Files - `benchmark.py`: main benchmark entrypoint - `results.log`: captured runtime output after execution Run Expected Output - `VRAM_USAGE: <value>MB` - `TOKENS_PER_SEC: <va...	05-18 20:57	Success	-	View
exp_pytrain.20260518195229.005_20260518_195342 Paper: pytrain.20260518195229.005	Python Skill Fallback Title: Type-Checked Python Package Generator - Focus: Python typing, Packaging Python projects - Note: Generated fallback due to unavailable model output.	05-18 19:54	Success	-	View
exp_pytrain.20260518184641.004_20260518_184805 Paper: pytrain.20260518184641.004	Python Skill Fallback Title: Building a Configurable Python Module with Typing Enhancements - Focus: {'topic_name': 'Type Hints', 'details':, {'topic_name': 'Python Packaging', 'deta - Note: Generated fallback due to unavailable model output.	05-18 18:49	Success	-	View
exp_pytrain.20260518174539.003_20260518_174730 Paper: pytrain.20260518174539.003	Python Skill Fallback Title: Type-Safe Async Package Manager - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 17:48	Success	-	View
exp_pytrain.20260518164552.002_20260518_164730 Paper: pytrain.20260518164552.002	Python Skill Fallback Title: Type-Enhanced Packaging Tools - Focus: {'name': 'Typing', 'details': ['Advanced, {'name': 'Packaging', 'details': ['Creat - Note: Generated fallback due to unavailable model output.	05-18 16:48	Success	-	View
exp_pytrain.20260518153103.001_20260518_153225 Paper: pytrain.20260518153103.001	Python Skill Fallback Title: Creating a Reusable Data Validation Library - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 15:33	Success	-	View
exp_pytrain.20260518140724.001_20260518_140855 Paper: pytrain.20260518140724.001	Python Skill Fallback Title: Develop a Robust Package with PyPI Support - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 14:09	Success	-	View
exp_pytrain.20260518132244.002_20260518_132313 Paper: pytrain.20260518132244.002	Here's the code for the benchmark: No summary available yet.	05-18 13:24	Success	-	View
exp_hf_2605.14786_20260518_131207 Paper: hf_2605.14786	Known By Their Actions: Fingerprinting LLM Browser Agents via UI Traces Paper ID: hf_2605.14786 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-18 13:13	Success	-	View
exp_pytrain.20260518124827.001_20260518_124856 Paper: pytrain.20260518124827.001	Autonomous Coding Drill: Robust Typing and Packaging ========================================================== Section 1: README.md Section 2: benchmark.py ```python import time from typing import Optional, Union def check_empty_string(s: str) -> bool: if not s: return True # Assuming an...	05-18 12:49	Success	-	View
exp_self.20260518120617.003_20260518_120618 Paper: self.20260518120617.003	Student hypothesis: ssm_mamba + throughput_optimization co-design Paper ID: self.20260518120617.003 - Hypothesis: Combining ssm_mamba + throughput_optimization + distillation will improve throughput or memory efficiency without breaking 8GB execution. - Plan: Create a compact comparative benchmark against...	05-18 12:06	Success	-	View
exp_self.20260518120006.002_20260518_120006 Paper: self.20260518120006.002	Student hypothesis: linear + ssm_mamba co-design Paper ID: self.20260518120006.002 - Hypothesis: Combining linear + ssm_mamba will improve throughput or memory efficiency without breaking 8GB execution. - Plan: Create a compact comparative benchmark against a simple baseline, measure VRAM...	05-18 12:00	Success	-	View
exp_self.20260518115355.001_20260518_115356 Paper: self.20260518115355.001	Student hypothesis: ssm + linear co-design Paper ID: self.20260518115355.001 - Hypothesis: Combining ssm + linear + ssm_mamba will improve throughput or memory efficiency without breaking 8GB execution. - Plan: Create a compact comparative benchmark against a simple baseline, measur...	05-18 11:53	Success	-	View
exp_pytrain.20260518115245.001_20260518_115245 Paper: pytrain.20260518115245.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 11:52	Success	-	View
exp_self.20260518114305.014_20260518_114305 Paper: self.20260518114305.014	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260518114305.014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 11:43	Success	-	View
exp_self.20260518113637.013_20260518_113637 Paper: self.20260518113637.013	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260518113637.013 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 11:36	Success	-	View
exp_self.20260518112917.012_20260518_112917 Paper: self.20260518112917.012	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260518112917.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 11:29	Success	-	View
exp_pytrain.20260518112307.005_20260518_112307 Paper: pytrain.20260518112307.005	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 11:23	Success	-	View
exp_self.20260518112201.011_20260518_112201 Paper: self.20260518112201.011	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260518112201.011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 11:22	Success	-	View
exp_self.20260518111517.010_20260518_111518 Paper: self.20260518111517.010	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260518111517.010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 11:15	Success	-	View
exp_self.20260518110844.009_20260518_110844 Paper: self.20260518110844.009	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260518110844.009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 11:08	Success	-	View
exp_self.20260518110234.008_20260518_110234 Paper: self.20260518110234.008	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260518110234.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 11:02	Success	-	View
exp_self.20260518105559.007_20260518_105559 Paper: self.20260518105559.007	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260518105559.007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 10:56	Success	-	View
exp_pytrain.20260518105207.004_20260518_105207 Paper: pytrain.20260518105207.004	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 10:52	Success	-	View
exp_self.20260518104957.006_20260518_104958 Paper: self.20260518104957.006	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260518104957.006 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 10:49	Success	-	View
exp_self.20260518104323.005_20260518_104323 Paper: self.20260518104323.005	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260518104323.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 10:43	Success	-	View
exp_hf_2506.01015_20260518_104101 Paper: hf_2506.01015	AuralSAM2: Enabling SAM2 Hear Through Pyramid Audio-Visual Feature Prompting Paper ID: hf_2506.01015 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-18 10:41	Success	-	View
exp_self.20260518103629.004_20260518_103630 Paper: self.20260518103629.004	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260518103629.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 10:36	Success	-	View
exp_oa_W7161354235_20260518_103153 Paper: oa_W7161354235	Negation Neglect: When models fail to learn negations in training Paper ID: oa_W7161354235 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-18 10:31	Success	-	View
exp_self.20260518103045.003_20260518_103046 Paper: self.20260518103045.003	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260518103045.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 10:30	Success	-	View
exp_self.20260518102409.002_20260518_102409 Paper: self.20260518102409.002	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260518102409.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 10:24	Success	-	View
exp_oa_W7161354484_20260518_102212 Paper: oa_W7161354484	Not Just RLHF: Why Alignment Alone Won't Fix Multi-Agent Sycophancy Paper ID: oa_W7161354484 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-18 10:22	Success	-	View
exp_pytrain.20260518102105.003_20260518_102105 Paper: pytrain.20260518102105.003	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 10:21	Success	-	View
exp_cr_10.1177_13621688261449335_20260518_101917 Paper: cr_10.1177_13621688261449335	From Strategy Awareness to Engagement: Self-Regulated Learning Strategies-Based Writing Instruction in L2 Essay Developm... Paper ID: cr_10.1177_13621688261449335 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recov...	05-18 10:19	Success	-	View
exp_hf_2605.15597_20260518_101748 Paper: hf_2605.15597	CM-EVS: Sparse Panoramic RGB-D-Pose Data for Complete Scene Coverage Paper ID: hf_2605.15597 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-18 10:17	Success	-	View
exp_cr_10.56726_irjmets96431_20260518_101627 Paper: cr_10.56726_irjmets96431	HIERARCHALIGN: LINEAR-COMPLEXITY CROSS-MODAL ATTENTION WITH RLHF FOR HUMAN-ALIGNED MULTI-MODAL LARGE LANGUAGE MODELS Paper ID: cr_10.56726_irjmets96431 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 10:16	Success	-	View
exp_hf_2605.15138_20260518_101430 Paper: hf_2605.15138	Forgetting That Sticks: Quantization-Permanent Unlearning via Circuit Attribution Paper ID: hf_2605.15138 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-18 10:14	Success	-	View
exp_cr_10.54254_2753-8818_2026.33701_20260518_101143 Paper: cr_10.54254_2753-8818_2026.33701	Large Language Models in Mental Health: An Investigation of Prompt-Based Approaches, Fine-Tuning and Domain Adaptation,... Paper ID: cr_10.54254_2753-8818_2026.33701 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: R...	05-18 10:11	Success	-	View
exp_self.20260518101034.001_20260518_101035 Paper: self.20260518101034.001	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260518101034.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 10:10	Success	-	View
exp_2303.15564v3_20260518_100854 Paper: 2303.15564v3	Backfill Candidate 2303.15564v3 Fallback synthesis: Mask and Restore: Blind Backdoor Defense at Test Time with Masked Autoencoder. Key signals: rag.	05-18 10:08	Success	-	View
exp_cr_10.3390_electronics12183925_20260518_100836 Paper: cr_10.3390_electronics12183925	Backfill Candidate cr_10.3390_electronics12183925 Fallback synthesis: Multi-Phase Focused PID Adaptive Tuning with Reinforcement Learning. Key signals: rag.	05-18 10:08	Success	-	View
exp_cr_10.51574_ijrer.v5i1.4200_20260518_100818 Paper: cr_10.51574_ijrer.v5i1.4200	Backfill Candidate cr_10.51574_ijrer.v5i1.4200 Fallback synthesis: Think of Pair Share Learning Model on Student Learning Activity in Science Subjects at State Elementary Madrasah. Key signals: rag.	05-18 10:08	Success	-	View
exp_2512.15753v1_20260518_100730 Paper: 2512.15753v1	Backfill Candidate 2512.15753v1 Fallback synthesis: TAO-Net: Two-stage Adaptive OOD Classification Network for Fine-grained Encrypted Traffic Classification. Key signals: rag.	05-18 10:07	Success	-	View
exp_2204.00598v2_20260518_100712 Paper: 2204.00598v2	Backfill Candidate 2204.00598v2 Fallback synthesis: Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language. Key signals: retrieval, rag.	05-18 10:07	Success	-	View
exp_2303.15604v2_20260518_100653 Paper: 2303.15604v2	Backfill Candidate 2303.15604v2 Fallback synthesis: HD-Bind: Encoding of Molecular Structure with Low Precision, Hyperdimensional Binary Representations. Key signals: inference, rag.	05-18 10:06	Success	-	View
exp_2303.15595v2_20260518_100635 Paper: 2303.15595v2	Backfill Candidate 2303.15595v2 Fallback synthesis: Bi-Encoder Cascades for Efficient Image Search. Key signals: retrieval, rag.	05-18 10:06	Success	-	View
exp_2310.03754v1_20260518_100616 Paper: 2310.03754v1	Backfill Candidate 2310.03754v1 Fallback synthesis: EMGTFNet: Fuzzy Vision Transformer to decode Upperlimb sEMG signals for Hand Gestures Recognition. Key signals: sparse, rag.	05-18 10:06	Success	-	View
exp_2406.13847v1_20260518_100558 Paper: 2406.13847v1	Backfill Candidate 2406.13847v1 Fallback synthesis: Locating and measuring marine aquaculture production from space: a computer vision approach in the French Mediterranean. Key signals: sparse, rag.	05-18 10:06	Success	-	View
exp_cr_10.1158_1538-7445.pancreatic24-b066_20260518_100540 Paper: cr_10.1158_1538-7445.pancreatic24-b066	Backfill Candidate cr_10.1158_1538-7445.pancreatic24-b066 Fallback synthesis: Abstract B066: An AI approach to unraveling treatment response in pancreatic cancer: Insights from the COMPASS trial leveraging large language models (LLMs). Key signals: retrieval, rag.	05-18 10:05	Success	-	View
exp_2412.12324v1_20260518_100522 Paper: 2412.12324v1	Backfill Candidate 2412.12324v1 Fallback synthesis: F-RBA: A Federated Learning-based Framework for Risk-based Authentication. Key signals: ssm, rag.	05-18 10:05	Success	-	View
exp_2506.12568v1_20260518_100504 Paper: 2506.12568v1	Backfill Candidate 2506.12568v1 Fallback synthesis: MVP-CBM:Multi-layer Visual Preference-enhanced Concept Bottleneck Model for Explainable Medical Image Classification. Key signals: sparse, rag.	05-18 10:05	Success	-	View
exp_cr_10.5539_elt.v18n7p15_20260518_100446 Paper: cr_10.5539_elt.v18n7p15	Backfill Candidate cr_10.5539_elt.v18n7p15 Fallback synthesis: Enhancing College English Education in China With AI: A Teacher-AI-Student Triad Model. Key signals: context, rag.	05-18 10:04	Success	-	View
exp_2512.11057v1_20260518_100428 Paper: 2512.11057v1	Backfill Candidate 2512.11057v1 Fallback synthesis: Weakly Supervised Tuberculosis Localization in Chest X-rays through Knowledge Distillation. Key signals: rag.	05-18 10:04	Success	-	View
exp_2512.11147v1_20260518_100408 Paper: 2512.11147v1	MiniScope: A Least Privilege Framework for Authorizing Tool Calling Agents Fallback synthesis: MiniScope: A Least Privilege Framework for Authorizing Tool Calling Agents. No strong keyword signals detected.	05-18 10:04	Success	-	View
exp_2506.12594v1_20260518_100319 Paper: 2506.12594v1	A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications Fallback synthesis: A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications. Key signals: retrieval.	05-18 10:03	Success	-	View
exp_2506.12617v3_20260518_100301 Paper: 2506.12617v3	Evaluating AI Alignment in Eleven LLMs through Output-Based Analysis and Human Benchmarking Fallback synthesis: Evaluating AI Alignment in Eleven LLMs through Output-Based Analysis and Human Benchmarking. No strong keyword signals detected.	05-18 10:03	Success	-	View
exp_2412.12351v2_20260518_100243 Paper: 2412.12351v2	Krony-PT: GPT2 compressed with Kronecker Products Fallback synthesis: Krony-PT: GPT2 compressed with Kronecker Products. No strong keyword signals detected.	05-18 10:02	Success	-	View
exp_2303.15621v2_20260518_100225 Paper: 2303.15621v2	ChatGPT as a Factual Inconsistency Evaluator for Text Summarization Fallback synthesis: ChatGPT as a Factual Inconsistency Evaluator for Text Summarization. Key signals: inference.	05-18 10:02	Success	-	View
exp_oa_W7124118447_20260518_100206 Paper: oa_W7124118447	Lost in the Noise: How Reasoning Models Fail with Contextual Distractors Fallback synthesis: Lost in the Noise: How Reasoning Models Fail with Contextual Distractors. Key signals: context, rag.	05-18 10:02	Success	-	View
exp_oa_W7131864980_20260518_100148 Paper: oa_W7131864980	EcoRL-Sched: Energy-Aware Heterogeneous GPU–FPGA Task Scheduling for Sustainable RLHF Training Pipelines Fallback synthesis: EcoRL-Sched: Energy-Aware Heterogeneous GPU–FPGA Task Scheduling for Sustainable RLHF Training Pipelines. Key signals: inference.	05-18 10:01	Success	-	View
exp_oa_W7133571298_20260518_100130 Paper: oa_W7133571298	Density-Guided Response Optimization: Community-Grounded Alignment via Implicit Acceptance Signals Fallback synthesis: Density-Guided Response Optimization: Community-Grounded Alignment via Implicit Acceptance Signals. Key signals: sparse, context, grounded.	05-18 10:01	Success	-	View
exp_oa_W7134860682_20260518_100112 Paper: oa_W7134860682	DARC: Disagreement-Aware Alignment via Risk-Constrained Decoding Fallback synthesis: DARC: Disagreement-Aware Alignment via Risk-Constrained Decoding. Key signals: inference, rag, rerank.	05-18 10:01	Success	-	View
exp_2512.10955v2_20260518_100054 Paper: 2512.10955v2	Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization Fallback synthesis: Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization. Key signals: context, retrieval, embedding.	05-18 10:00	Success	-	View
exp_2512.11099v1_20260518_100036 Paper: 2512.11099v1	VGent: Visual Grounding via Modular Design for Disentangling Reasoning and Prediction Fallback synthesis: VGent: Visual Grounding via Modular Design for Disentangling Reasoning and Prediction. Key signals: inference, rag.	05-18 10:00	Success	-	View
exp_cr_10.3390_agriculture15242569_20260518_100018 Paper: cr_10.3390_agriculture15242569	Smart Irrigation Scheduling for Crop Production Using a Crop Model and Improved Deep Reinforcement Learning Fallback synthesis: Smart Irrigation Scheduling for Crop Production Using a Crop Model and Improved Deep Reinforcement Learning. Key signals: memory.	05-18 10:00	Success	-	View
exp_2506.12576v2_20260518_100000 Paper: 2506.12576v2	Enabling Precise Topic Alignment in Large Language Models Via Sparse Autoencoders Fallback synthesis: Enabling Precise Topic Alignment in Large Language Models Via Sparse Autoencoders. Key signals: sparse, inference, rag.	05-18 10:00	Success	-	View
exp_2506.12606v2_20260518_095913 Paper: 2506.12606v2	An Exploration of Mamba for Speech Self-Supervised Models Fallback synthesis: An Exploration of Mamba for Speech Self-Supervised Models. Key signals: linear, context, rag.	05-18 09:59	Success	-	View
exp_2506.13814v1_20260518_095854 Paper: 2506.13814v1	ReFrame: Layer Caching for Accelerated Inference in Real-Time Rendering Fallback synthesis: ReFrame: Layer Caching for Accelerated Inference in Real-Time Rendering. Key signals: inference, rag.	05-18 09:58	Success	-	View
exp_2506.17285v1_20260518_095836 Paper: 2506.17285v1	A Framework for Generating Conversational Recommendation Datasets from Behavioral Interactions Fallback synthesis: A Framework for Generating Conversational Recommendation Datasets from Behavioral Interactions. Key signals: context, grounded.	05-18 09:58	Success	-	View
exp_core_297420785_20260518_095818 Paper: core_297420785	Towards Principled Training and Serving of Large Language Models Fallback synthesis: Towards Principled Training and Serving of Large Language Models. Key signals: inference.	05-18 09:58	Success	-	View
exp_2412.12409v1_20260518_095800 Paper: 2412.12409v1	Improving Cooperation in Language Games with Bayesian Inference and the Cognitive Hierarchy Fallback synthesis: Improving Cooperation in Language Games with Bayesian Inference and the Cognitive Hierarchy. Key signals: inference, rag, embedding.	05-18 09:58	Success	-	View
exp_2406.13809v1_20260518_095741 Paper: 2406.13809v1	Towards Holistic Language-video Representation: the language model-enhanced MSR-Video to Text Dataset Fallback synthesis: Towards Holistic Language-video Representation: the language model-enhanced MSR-Video to Text Dataset. Key signals: context, retrieval.	05-18 09:57	Success	-	View
exp_2406.13858v1_20260518_095723 Paper: 2406.13858v1	Distributional reasoning in LLMs: Parallel reasoning processes in multi-hop reasoning Fallback synthesis: Distributional reasoning in LLMs: Parallel reasoning processes in multi-hop reasoning. Key signals: linear, inference, embedding.	05-18 09:57	Success	-	View
exp_2406.13885v1_20260518_095705 Paper: 2406.13885v1	Knowledge Tagging System on Math Questions via LLMs with Flexible Demonstration Retriever Fallback synthesis: Knowledge Tagging System on Math Questions via LLMs with Flexible Demonstration Retriever. Key signals: context, embedding.	05-18 09:57	Success	-	View
exp_cr_10.3390_app131810379_20260518_095647 Paper: cr_10.3390_app131810379	Novel Paintings from the Latent Diffusion Model through Transfer Learning Fallback synthesis: Novel Paintings from the Latent Diffusion Model through Transfer Learning. Key signals: context, memory.	05-18 09:56	Success	-	View
exp_cr_10.47689_stars.university-pp276-279_20260518_095628 Paper: cr_10.47689_stars.university-pp276-279	Integrating pragmatic competence to english language classes Fallback synthesis: Integrating pragmatic competence to english language classes. Key signals: context, rag.	05-18 09:56	Success	-	View
exp_2303.15569v1_20260518_095610 Paper: 2303.15569v1	Core-Periphery Principle Guided Redesign of Self-Attention in Transformers Fallback synthesis: Core-Periphery Principle Guided Redesign of Self-Attention in Transformers. Key signals: sparse, rag.	05-18 09:56	Success	-	View
exp_2303.15585v4_20260518_095552 Paper: 2303.15585v4	(Un)fair devices: Moving beyond AI accuracy in personal sensing Fallback synthesis: (Un)fair devices: Moving beyond AI accuracy in personal sensing. Key signals: ssm, rag, grounded.	05-18 09:55	Success	-	View
exp_2209.15439v2_20260518_095504 Paper: 2209.15439v2	Exploiting Instance-based Mixed Sampling via Auxiliary Source Domain Supervision for Domain-adaptive Action Detection Fallback synthesis: Exploiting Instance-based Mixed Sampling via Auxiliary Source Domain Supervision for Domain-adaptive Action Detection. Key signals: ssm, context, rag.	05-18 09:55	Success	-	View
exp_cr_10.1609_aaai.v36i11.21480_20260518_095446 Paper: cr_10.1609_aaai.v36i11.21480	PrEF: Probabilistic Electricity Forecasting via Copula-Augmented State Space Model Fallback synthesis: PrEF: Probabilistic Electricity Forecasting via Copula-Augmented State Space Model. Key signals: linear, ssm, inference.	05-18 09:54	Success	-	View
exp_2204.00673v2_20260518_095428 Paper: 2204.00673v2	Learnable latent embeddings for joint behavioral and neural analysis Fallback synthesis: Learnable latent embeddings for joint behavioral and neural analysis. Key signals: linear, rag, embedding.	05-18 09:54	Success	-	View
exp_2204.00707v1_20260518_095410 Paper: 2204.00707v1	Efficient Argument Structure Extraction with Transfer Learning and Active Learning Fallback synthesis: Efficient Argument Structure Extraction with Transfer Learning and Active Learning. Key signals: context, rag.	05-18 09:54	Success	-	View
exp_gh_maursader_symbiote-protocol_20260518_095352 Paper: gh_maursader_symbiote-protocol	maursader/symbiote-protocol Fallback synthesis: maursader/symbiote-protocol. Key signals: memory, rag.	05-18 09:53	Success	-	View
exp_2512.11179v3_20260518_095334 Paper: 2512.11179v3	Bandwidth-constrained Variational Message Encoding for Cooperative Multi-agent Reinforcement Learning Fallback synthesis: Bandwidth-constrained Variational Message Encoding for Cooperative Multi-agent Reinforcement Learning. Key signals: sparse.	05-18 09:53	Success	-	View
exp_cr_10.1038_s41390-025-04669-8_20260518_095316 Paper: cr_10.1038_s41390-025-04669-8	Is this neonate feeling pain? Leveraging clinical knowledge towards high-precision Large Language Model-based neonatal p... Fallback synthesis: Is this neonate feeling pain? Leveraging clinical knowledge towards high-precision Large Language Model-based neonatal pain assessment. Key signals: ssm, rag.	05-18 09:53	Success	-	View
exp_oa_W4415312651_20260518_095258 Paper: oa_W4415312651	Adaptive Accompaniment with ReaLchords Fallback synthesis: Adaptive Accompaniment with ReaLchords. Key signals: rag.	05-18 09:53	Success	-	View
exp_oa_W4415056742_20260518_095239 Paper: oa_W4415056742	Probabilistic Modeling of Latent Agentic Substructures in Deep Neural Networks Fallback synthesis: Probabilistic Modeling of Latent Agentic Substructures in Deep Neural Networks. Key signals: linear, grounded.	05-18 09:52	Success	-	View
exp_oa_W4414098962_20260518_095221 Paper: oa_W4414098962	ForestGPT and Beyond: A Trustworthy Domain-Specific Large Language Model Paving the Way to Forestry 5.0 Fallback synthesis: ForestGPT and Beyond: A Trustworthy Domain-Specific Large Language Model Paving the Way to Forestry 5.0. Key signals: retrieval, rag.	05-18 09:52	Success	-	View
exp_2506.12634v1_20260518_095203 Paper: 2506.12634v1	Between Predictability and Randomness: Seeking Artistic Inspiration from AI Generative Models Fallback synthesis: Between Predictability and Randomness: Seeking Artistic Inspiration from AI Generative Models. Key signals: memory, rag.	05-18 09:52	Success	-	View
exp_2506.22454v1_20260518_095145 Paper: 2506.22454v1	Microelectrode Signal Dynamics as Biomarkers of Subthalamic Nucleus Entry on Deep Brain Stimulation: A Nonlinear Feature... Fallback synthesis: Microelectrode Signal Dynamics as Biomarkers of Subthalamic Nucleus Entry on Deep Brain Stimulation: A Nonlinear Feature Approach. Key signals: linear, rag.	05-18 09:51	Success	-	View
exp_pytrain.20260518095042.002_20260518_095043 Paper: pytrain.20260518095042.002	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 09:50	Success	-	View
exp_cr_10.3390_tropicalmed10060167_20260518_095024 Paper: cr_10.3390_tropicalmed10060167	The Application of Machine Learning Algorithms to Predict HIV Testing Using Evidence from the 2002–2017 South African Ad... Fallback synthesis: The Application of Machine Learning Algorithms to Predict HIV Testing Using Evidence from the 2002–2017 South African Adult Population-Based Surveys: An HIV Testing Predictive Model. Key signals: ssm, rag.	05-18 09:50	Success	-	View
exp_oa_W4404344173_20260518_095005 Paper: oa_W4404344173	Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies Fallback synthesis: Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies. Key signals: memory, rag.	05-18 09:50	Success	-	View
exp_2412.12358v1_20260518_094947 Paper: 2412.12358v1	BioRAGent: A Retrieval-Augmented Generation System for Showcasing Generative Query Expansion and Domain-Specific Search... Fallback synthesis: BioRAGent: A Retrieval-Augmented Generation System for Showcasing Generative Query Expansion and Domain-Specific Search for Scientific Q&A. Key signals: retrieval, rag.	05-18 09:49	Success	-	View
exp_2406.13808v3_20260518_094928 Paper: 2406.13808v3	Can Low-Rank Knowledge Distillation in LLMs be Useful for Microelectronic Reasoning? Fallback synthesis: Can Low-Rank Knowledge Distillation in LLMs be Useful for Microelectronic Reasoning?. Key signals: context.	05-18 09:49	Success	-	View
exp_2406.13840v1_20260518_094910 Paper: 2406.13840v1	StackRAG Agent: Improving Developer Answers with Retrieval-Augmented Generation Fallback synthesis: StackRAG Agent: Improving Developer Answers with Retrieval-Augmented Generation. Key signals: retrieval, rag.	05-18 09:49	Success	-	View
exp_2309.13429v1_20260518_094852 Paper: 2309.13429v1	Modeling Student Performance in Game-Based Learning Environments Fallback synthesis: Modeling Student Performance in Game-Based Learning Environments. Key signals: context, rag.	05-18 09:48	Success	-	View
exp_2309.13464v1_20260518_094834 Paper: 2309.13464v1	Personalised and Adjustable Interval Type-2 Fuzzy-Based PPG Quality Assessment for the Edge Fallback synthesis: Personalised and Adjustable Interval Type-2 Fuzzy-Based PPG Quality Assessment for the Edge. Key signals: ssm, rag.	05-18 09:48	Success	-	View
exp_2309.13500v3_20260518_094816 Paper: 2309.13500v3	Enhancing Student Performance Prediction on Learnersourced Questions with SGNN-LLM Synergy Fallback synthesis: Enhancing Student Performance Prediction on Learnersourced Questions with SGNN-LLM Synergy. Key signals: sparse, embedding.	05-18 09:48	Success	-	View
exp_2209.14338v2_20260518_094758 Paper: 2209.14338v2	Who is GPT-3? An Exploration of Personality, Values and Demographics Fallback synthesis: Who is GPT-3? An Exploration of Personality, Values and Demographics. Key signals: ssm, memory.	05-18 09:48	Success	-	View
exp_cr_10.1093_humrep_deac107.551_20260518_094740 Paper: cr_10.1093_humrep_deac107.551	P-599 An expected benefit analysis of using an interpretable machine learning model for optimizing the day of trigger du... Fallback synthesis: P-599 An expected benefit analysis of using an interpretable machine learning model for optimizing the day of trigger during ovarian stimulation. Key signals: linear, rag.	05-18 09:47	Success	-	View
exp_cr_10.3390_biology11070995_20260518_094722 Paper: cr_10.3390_biology11070995	Predicting Protein–Protein Interactions Based on Ensemble Learning-Based Model from Protein Sequence Fallback synthesis: Predicting Protein–Protein Interactions Based on Ensemble Learning-Based Model from Protein Sequence. Key signals: ssm, rag.	05-18 09:47	Success	-	View
exp_2204.09640v3_20260518_094635 Paper: 2204.09640v3	Probabilistic AutoRegressive Neural Networks for Accurate Long-range Forecasting Fallback synthesis: Probabilistic AutoRegressive Neural Networks for Accurate Long-range Forecasting. Key signals: linear, rag.	05-18 09:46	Success	-	View
exp_2204.00703v5_20260518_094616 Paper: 2204.00703v5	A Reinforcement Learning Approach to Sensing Design in Resource-Constrained Wireless Networked Control Systems Fallback synthesis: A Reinforcement Learning Approach to Sensing Design in Resource-Constrained Wireless Networked Control Systems. Key signals: rag.	05-18 09:46	Success	-	View
exp_core_305590553_20260518_094558 Paper: core_305590553	Grounded Language Learning with Foundation Models Fallback synthesis: Grounded Language Learning with Foundation Models. Key signals: grounded.	05-18 09:46	Success	-	View
exp_2512.11141v2_20260518_094540 Paper: 2512.11141v2	Learning complete and explainable visual representations from itemized text supervision Fallback synthesis: Learning complete and explainable visual representations from itemized text supervision. Key signals: rag, grounded, embedding.	05-18 09:45	Success	-	View
exp_cr_10.31449_inf.v49i24.8395_20260518_094522 Paper: cr_10.31449_inf.v49i24.8395	Hybrid Deep Learning Model for Multi-Source Remote Sensing Data Fusion: Integrating DenseNet and Swin Transformer for Sp... Fallback synthesis: Hybrid Deep Learning Model for Multi-Source Remote Sensing Data Fusion: Integrating DenseNet and Swin Transformer for Spatial Alignment and Feature Extraction. Key signals: context, inference.	05-18 09:45	Success	-	View
exp_2506.12600v1_20260518_094504 Paper: 2506.12600v1	Trust-MARL: Trust-Based Multi-Agent Reinforcement Learning Framework for Cooperative On-Ramp Merging Control in Heteroge... Fallback synthesis: Trust-MARL: Trust-Based Multi-Agent Reinforcement Learning Framework for Cooperative On-Ramp Merging Control in Heterogeneous Traffic Flow. Key signals: context, rag.	05-18 09:45	Success	-	View
exp_2506.12607v1_20260518_094446 Paper: 2506.12607v1	Towards Building General Purpose Embedding Models for Industry 4.0 Agents Fallback synthesis: Towards Building General Purpose Embedding Models for Industry 4.0 Agents. Key signals: context, inference, rag, embedding.	05-18 09:44	Success	-	View
exp_2412.19823v1_20260518_094428 Paper: 2412.19823v1	A Survey on Large Language Models for Communication, Network, and Service Management: Application Insights, Challenges,... Fallback synthesis: A Survey on Large Language Models for Communication, Network, and Service Management: Application Insights, Challenges, and Future Directions. Key signals: context, rag.	05-18 09:44	Success	-	View
exp_2309.13430v1_20260518_094410 Paper: 2309.13430v1	Resolving References in Visually-Grounded Dialogue via Text Generation Fallback synthesis: Resolving References in Visually-Grounded Dialogue via Text Generation. Key signals: context, retrieval, rag, grounded.	05-18 09:44	Success	-	View
exp_2303.15555v1_20260518_094352 Paper: 2303.15555v1	Object Discovery from Motion-Guided Tokens Fallback synthesis: Object Discovery from Motion-Guided Tokens. Key signals: quantization, memory, rag.	05-18 09:43	Success	-	View
exp_2209.14434v1_20260518_094334 Paper: 2209.14434v1	Efficient Medical Image Assessment via Self-supervised Learning Fallback synthesis: Efficient Medical Image Assessment via Self-supervised Learning. Key signals: ssm, rag, embedding.	05-18 09:43	Success	-	View
exp_gh_Nestallum_tech-news-rag-assistant_20260518_094316 Paper: gh_Nestallum_tech-news-rag-assistant	Nestallum/tech-news-rag-assistant Fallback synthesis: Nestallum/tech-news-rag-assistant. Key signals: retrieval, rag, embedding.	05-18 09:43	Success	-	View
exp_2512.11074v1_20260518_094228 Paper: 2512.11074v1	MultiScript30k: Leveraging Multilingual Embeddings to Extend Cross Script Parallel Data Fallback synthesis: MultiScript30k: Leveraging Multilingual Embeddings to Extend Cross Script Parallel Data. Key signals: ssm, rag, embedding.	05-18 09:42	Success	-	View
exp_2512.11087v1_20260518_094209 Paper: 2512.11087v1	Clip-and-Verify: Linear Constraint-Driven Domain Clipping for Accelerating Neural Network Verification Fallback synthesis: Clip-and-Verify: Linear Constraint-Driven Domain Clipping for Accelerating Neural Network Verification. Key signals: linear, context, rag.	05-18 09:42	Success	-	View
exp_2512.11131v1_20260518_094151 Paper: 2512.11131v1	Fairness-Regularized Online Optimization with Switching Costs Fallback synthesis: Fairness-Regularized Online Optimization with Switching Costs. Key signals: linear, inference, rag.	05-18 09:41	Success	-	View
exp_oa_W4413800076_20260518_094133 Paper: oa_W4413800076	From Illusion to Insight: A Taxonomic Survey of Hallucination Mitigation Techniques in LLMs Fallback synthesis: From Illusion to Insight: A Taxonomic Survey of Hallucination Mitigation Techniques in LLMs. Key signals: retrieval, grounded.	05-18 09:41	Success	-	View
exp_2506.12597v1_20260518_094115 Paper: 2506.12597v1	Automatic Expert Discovery in LLM Upcycling via Sparse Interpolated Mixture-of-Experts Fallback synthesis: Automatic Expert Discovery in LLM Upcycling via Sparse Interpolated Mixture-of-Experts. Key signals: sparse, moe, rag.	05-18 09:41	Success	-	View
exp_2506.12655v2_20260518_094056 Paper: 2506.12655v2	Beyond Sin-Squared Error: Linear-Time Entrywise Uncertainty Quantification for Streaming PCA Fallback synthesis: Beyond Sin-Squared Error: Linear-Time Entrywise Uncertainty Quantification for Streaming PCA. Key signals: linear, inference, rag.	05-18 09:40	Success	-	View
exp_2412.12300v3_20260518_094038 Paper: 2412.12300v3	Unanswerability Evaluation for Retrieval Augmented Generation Fallback synthesis: Unanswerability Evaluation for Retrieval Augmented Generation. Key signals: retrieval, rag, rerank.	05-18 09:40	Success	-	View
exp_2412.12322v1_20260518_094019 Paper: 2412.12322v1	RAG Playground: A Framework for Systematic Evaluation of Retrieval Strategies and Prompt Engineering in RAG Systems Fallback synthesis: RAG Playground: A Framework for Systematic Evaluation of Retrieval Strategies and Prompt Engineering in RAG Systems. Key signals: retrieval, rag, rerank.	05-18 09:40	Success	-	View
exp_2412.12359v2_20260518_094002 Paper: 2412.12359v2	LLaVA Steering: Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering Fallback synthesis: LLaVA Steering: Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering. Key signals: linear, context, rag.	05-18 09:40	Success	-	View
exp_oa_W4399837987_20260518_093943 Paper: oa_W4399837987	Supporting Human Raters with the Detection of Harmful Content using Large Language Models Fallback synthesis: Supporting Human Raters with the Detection of Harmful Content using Large Language Models. Key signals: ssm, context, rag.	05-18 09:39	Success	-	View
exp_2406.13805v1_20260518_093925 Paper: 2406.13805v1	WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia Fallback synthesis: WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia. Key signals: context, retrieval, rag.	05-18 09:39	Success	-	View
exp_2406.13851v1_20260518_093907 Paper: 2406.13851v1	Optimizing Quantile-based Trading Strategies in Electricity Arbitrage Fallback synthesis: Optimizing Quantile-based Trading Strategies in Electricity Arbitrage. Key signals: ssm, rag.	05-18 09:39	Success	-	View
exp_cr_10.3389_feduc.2024.1355952_20260518_093819 Paper: cr_10.3389_feduc.2024.1355952	Applying the MSMLP model in advancing language teaching and learning: a longitudinal case study on soft skills developme... Fallback synthesis: Applying the MSMLP model in advancing language teaching and learning: a longitudinal case study on soft skills development. Key signals: ssm, context, rag.	05-18 09:38	Success	-	View
exp_cr_10.31849_utamax.v5i1.11260_20260518_093801 Paper: cr_10.31849_utamax.v5i1.11260	From Speech to Text: Enhancing Descriptive Paragraph Writing with Unjuk Tutur‘s Learning Model Fallback synthesis: From Speech to Text: Enhancing Descriptive Paragraph Writing with Unjuk Tutur‘s Learning Model. Key signals: ssm, context, rag.	05-18 09:38	Success	-	View
exp_2204.00595v1_20260518_093743 Paper: 2204.00595v1	Monarch: Expressive Structured Matrices for Efficient and Accurate Training Fallback synthesis: Monarch: Expressive Structured Matrices for Efficient and Accurate Training. Key signals: sparse, memory.	05-18 09:37	Success	-	View
exp_2303.15446v2_20260518_093725 Paper: 2303.15446v2	SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications Fallback synthesis: SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications. Key signals: linear, context, inference.	05-18 09:37	Success	-	View
exp_oa_W7118543654_20260518_093707 Paper: oa_W7118543654	Instruction Tuning for Large Language Models: RLHF, Supervised Fine-Tuning, and Alignment Strategies Fallback synthesis: Instruction Tuning for Large Language Models: RLHF, Supervised Fine-Tuning, and Alignment Strategies. No strong keyword signals detected.	05-18 09:37	Success	-	View
exp_2512.11061v1_20260518_093649 Paper: 2512.11061v1	VDAWorld: World Modelling via VLM-Directed Abstraction and Simulation Fallback synthesis: VDAWorld: World Modelling via VLM-Directed Abstraction and Simulation. Key signals: grounded.	05-18 09:36	Success	-	View
exp_oa_W4417539773_20260518_093630 Paper: oa_W4417539773	Towards AI Search Paradigm Fallback synthesis: Towards AI Search Paradigm. Key signals: inference, retrieval.	05-18 09:36	Success	-	View
exp_2412.15262v1_20260518_093613 Paper: 2412.15262v1	Advanced ingestion process powered by LLM parsing for RAG system Fallback synthesis: Advanced ingestion process powered by LLM parsing for RAG system. Key signals: context, retrieval, rag, embedding.	05-18 09:36	Success	-	View
exp_2412.12364v1_20260518_093554 Paper: 2412.12364v1	LogBabylon: A Unified Framework for Cross-Log File Integration and Analysis Fallback synthesis: LogBabylon: A Unified Framework for Cross-Log File Integration and Analysis. Key signals: context, retrieval, rag.	05-18 09:35	Success	-	View
exp_2512.11130v2_20260518_093537 Paper: 2512.11130v2	Fast-FoundationStereo: Real-Time Zero-Shot Stereo Matching Fallback synthesis: Fast-FoundationStereo: Real-Time Zero-Shot Stereo Matching. No strong keyword signals detected.	05-18 09:35	Success	-	View
exp_2506.12647v1_20260518_093518 Paper: 2506.12647v1	Optimizing Blood Transfusions and Predicting Shortages in Resource-Constrained Areas Fallback synthesis: Optimizing Blood Transfusions and Predicting Shortages in Resource-Constrained Areas. Key signals: linear, memory, rag.	05-18 09:35	Success	-	View
exp_oa_W7130510261_20260518_093501 Paper: oa_W7130510261	Training Methods for Large Language Models: Current Approaches and Challenges Fallback synthesis: Training Methods for Large Language Models: Current Approaches and Challenges. Key signals: sparse, moe, retrieval.	05-18 09:35	Success	-	View
exp_2303.15553v3_20260518_093412 Paper: 2303.15553v3	MoViT: Memorizing Vision Transformers for Medical Image Analysis Fallback synthesis: MoViT: Memorizing Vision Transformers for Medical Image Analysis. Key signals: context, memory, inference, rag.	05-18 09:34	Success	-	View
exp_2204.00716v2_20260518_093354 Paper: 2204.00716v2	CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos Fallback synthesis: CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos. Key signals: ssm, retrieval, embedding.	05-18 09:33	Success	-	View
exp_2406.13868v1_20260518_093336 Paper: 2406.13868v1	SDQ: Sparse Decomposed Quantization for LLM Inference Fallback synthesis: SDQ: Sparse Decomposed Quantization for LLM Inference. Key signals: quantization, sparse, memory, inference.	05-18 09:33	Success	-	View
exp_core_160824652_20260518_093318 Paper: core_160824652	Efficient and Scalable Large Multimodal Models Fallback synthesis: Efficient and Scalable Large Multimodal Models. Key signals: quantization, moe, memory, inference.	05-18 09:33	Success	-	View
exp_cr_10.71465_csb162_20260518_093300 Paper: cr_10.71465_csb162	Domain-Adapted Large Language Models for Industrial Applications: From Fine-Tuning to Real-Time Deployment Fallback synthesis: Domain-Adapted Large Language Models for Industrial Applications: From Fine-Tuning to Real-Time Deployment. Key signals: context, inference, retrieval, rag.	05-18 09:33	Success	-	View
exp_pytrain.20260518092010.001_20260518_092010 Paper: pytrain.20260518092010.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 09:20	Success	-	View
exp_pytrain.20260518091904.006_20260518_091905 Paper: pytrain.20260518091904.006	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 09:19	Success	-	View
exp_self.20260518091638.023_20260518_091638 Paper: self.20260518091638.023	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518091638.023 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 09:16	Success	-	View
exp_self.20260518091003.022_20260518_091004 Paper: self.20260518091003.022	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518091003.022 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 09:10	Success	-	View
exp_self.20260518090327.021_20260518_090328 Paper: self.20260518090327.021	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518090327.021 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 09:03	Success	-	View
exp_self.20260518085645.020_20260518_085645 Paper: self.20260518085645.020	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518085645.020 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 08:56	Success	-	View
exp_self.20260518085009.019_20260518_085009 Paper: self.20260518085009.019	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518085009.019 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 08:50	Success	-	View
exp_pytrain.20260518084836.005_20260518_084837 Paper: pytrain.20260518084836.005	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 08:48	Success	-	View
exp_self.20260518084228.018_20260518_084228 Paper: self.20260518084228.018	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518084228.018 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 08:42	Success	-	View
exp_self.20260518083549.017_20260518_083550 Paper: self.20260518083549.017	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518083549.017 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 08:35	Success	-	View
exp_self.20260518082913.016_20260518_082914 Paper: self.20260518082913.016	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518082913.016 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 08:29	Success	-	View
exp_self.20260518082228.015_20260518_082229 Paper: self.20260518082228.015	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518082228.015 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 08:22	Success	-	View
exp_cr_10.3390_rs18101619_20260518_081902 Paper: cr_10.3390_rs18101619	Comprehensive Analysis of Snow BRDF Variations by Assessing the Improved Kernel-Driven BRDF Model Paper ID: cr_10.3390_rs18101619 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered be...	05-18 08:19	Success	-	View
exp_pytrain.20260518081648.004_20260518_081648 Paper: pytrain.20260518081648.004	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 08:16	Success	-	View
exp_self.20260518081544.014_20260518_081545 Paper: self.20260518081544.014	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518081544.014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 08:15	Success	-	View
exp_self.20260518080900.013_20260518_080901 Paper: self.20260518080900.013	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518080900.013 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 08:09	Success	-	View
exp_self.20260518080219.012_20260518_080219 Paper: self.20260518080219.012	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518080219.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 08:02	Success	-	View
exp_self.20260518075542.011_20260518_075542 Paper: self.20260518075542.011	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518075542.011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 07:55	Success	-	View
exp_self.20260518074905.010_20260518_074905 Paper: self.20260518074905.010	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518074905.010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 07:49	Success	-	View
exp_pytrain.20260518074621.003_20260518_074621 Paper: pytrain.20260518074621.003	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 07:46	Success	-	View
exp_hf_2605.15592_20260518_074351 Paper: hf_2605.15592	Efficient Image Synthesis with Sphere Latent Encoder Paper ID: hf_2605.15592 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-18 07:43	Success	-	View
exp_self.20260518074244.009_20260518_074244 Paper: self.20260518074244.009	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518074244.009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 07:42	Success	-	View
exp_self.20260518073606.008_20260518_073606 Paper: self.20260518073606.008	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518073606.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 07:36	Success	-	View
exp_self.20260518072926.007_20260518_072926 Paper: self.20260518072926.007	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518072926.007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 07:29	Success	-	View
exp_self.20260518072246.006_20260518_072247 Paper: self.20260518072246.006	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518072246.006 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 07:22	Success	-	View
exp_self.20260518071640.005_20260518_071640 Paper: self.20260518071640.005	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518071640.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 07:16	Success	-	View
exp_pytrain.20260518071506.002_20260518_071506 Paper: pytrain.20260518071506.002	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 07:15	Success	-	View
exp_self.20260518071041.004_20260518_071041 Paper: self.20260518071041.004	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518071041.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 07:10	Success	-	View
exp_oa_W4362515116_20260518_070842 Paper: oa_W4362515116	A Survey of Large Language Models Paper ID: oa_W4362515116 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-18 07:08	Success	-	View
exp_hf_2605.12058_20260518_070538 Paper: hf_2605.12058	Hölder Policy Optimisation Paper ID: hf_2605.12058 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-18 07:05	Success	-	View
exp_self.20260518070318.003_20260518_070319 Paper: self.20260518070318.003	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518070318.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 07:03	Success	-	View
exp_hf_2605.15375_20260518_065839 Paper: hf_2605.15375	ChangeFlow -- Latent Rectified Flow for Change Detection in Remote Sensing Paper ID: hf_2605.15375 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-18 06:58	Success	-	View
exp_self.20260518065732.002_20260518_065733 Paper: self.20260518065732.002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518065732.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 06:57	Success	-	View
exp_oa_W7160968741_20260518_065509 Paper: oa_W7160968741	Star Elastic: Many-in-One Reasoning LLMs with Efficient Budget Control Paper ID: oa_W7160968741 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-18 06:55	Success	-	View
exp_hf_2605.15250_20260518_065235 Paper: hf_2605.15250	GQLA: Group-Query Latent Attention for Hardware-Adaptive Large Language Model Decoding Paper ID: hf_2605.15250 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-18 06:52	Success	-	View
exp_self.20260518065123.001_20260518_065123 Paper: self.20260518065123.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260518065123.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-18 06:51	Success	-	View
exp_cr_10.54097_yhppk428_20260518_064926 Paper: cr_10.54097_yhppk428	Distributed Training Strategies for Reducing Carbon Footprint in Large Scale Model Development Paper ID: cr_10.54097_yhppk428 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered ben...	05-18 06:49	Success	-	View
exp_2605.16255v1_20260518_064732 Paper: 2605.16255v1	Designing Datacenter Power Delivery Hierarchies for the AI Era Paper ID: 2605.16255v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-18 06:47	Success	-	View
exp_pytrain.20260518064350.001_20260518_064350 Paper: pytrain.20260518064350.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-18 06:43	Success	-	View
exp_pytrain.20260510093059.001_20260510_093059 Paper: pytrain.20260510093059.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 09:31	Success	-	View
exp_gh_echo313unfolding_helix-substrate_20260510_092941 Paper: gh_echo313unfolding_helix-substrate	echo313unfolding/helix-substrate Paper ID: gh_echo313unfolding_helix-substrate - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal...	05-10 09:29	Success	-	View
exp_self.20260510092616.003_20260510_092617 Paper: self.20260510092616.003	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510092616.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 09:26	Success	-	View
exp_gh_Priyanka-techi_rag-qa-chatbot_20260510_092250 Paper: gh_Priyanka-techi_rag-qa-chatbot	Priyanka-techi/rag-qa-chatbot Paper ID: gh_Priyanka-techi_rag-qa-chatbot - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: R...	05-10 09:22	Success	-	View
exp_self.20260510092032.002_20260510_092033 Paper: self.20260510092032.002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510092032.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 09:20	Success	-	View
exp_self.20260510091415.001_20260510_091415 Paper: self.20260510091415.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510091415.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 09:14	Success	-	View
exp_pytrain.20260510091242.001_20260510_091242 Paper: pytrain.20260510091242.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 09:12	Success	-	View
exp_self.20260510085803.013_20260510_085804 Paper: self.20260510085803.013	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510085803.013 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 08:58	Success	-	View
exp_self.20260510085131.012_20260510_085132 Paper: self.20260510085131.012	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510085131.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 08:51	Success	-	View
exp_self.20260510084455.011_20260510_084456 Paper: self.20260510084455.011	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510084455.011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 08:44	Success	-	View
exp_pytrain.20260510084107.003_20260510_084108 Paper: pytrain.20260510084107.003	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 08:41	Success	-	View
exp_self.20260510083857.010_20260510_083858 Paper: self.20260510083857.010	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510083857.010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 08:39	Success	-	View
exp_self.20260510083220.009_20260510_083220 Paper: self.20260510083220.009	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510083220.009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 08:32	Success	-	View
exp_self.20260510082546.008_20260510_082547 Paper: self.20260510082546.008	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510082546.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 08:25	Success	-	View
exp_self.20260510081913.007_20260510_081913 Paper: self.20260510081913.007	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510081913.007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 08:19	Success	-	View
exp_self.20260510081240.006_20260510_081241 Paper: self.20260510081240.006	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510081240.006 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 08:12	Success	-	View
exp_pytrain.20260510080959.002_20260510_080959 Paper: pytrain.20260510080959.002	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 08:10	Success	-	View
exp_self.20260510080638.005_20260510_080639 Paper: self.20260510080638.005	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510080638.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 08:06	Success	-	View
exp_self.20260510080008.004_20260510_080008 Paper: self.20260510080008.004	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510080008.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 08:00	Success	-	View
exp_self.20260510075333.003_20260510_075333 Paper: self.20260510075333.003	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510075333.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 07:53	Success	-	View
exp_self.20260510074658.002_20260510_074659 Paper: self.20260510074658.002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510074658.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 07:47	Success	-	View
exp_self.20260510074025.001_20260510_074026 Paper: self.20260510074025.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510074025.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 07:40	Success	-	View
exp_pytrain.20260510073854.001_20260510_073854 Paper: pytrain.20260510073854.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 07:38	Success	-	View
exp_pytrain.20260510073537.002_20260510_073537 Paper: pytrain.20260510073537.002	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 07:35	Success	-	View
exp_self.20260510073328.005_20260510_073328 Paper: self.20260510073328.005	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510073328.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 07:33	Success	-	View
exp_self.20260510072653.004_20260510_072653 Paper: self.20260510072653.004	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510072653.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 07:26	Success	-	View
exp_self.20260510072005.003_20260510_072006 Paper: self.20260510072005.003	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510072005.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 07:20	Success	-	View
exp_self.20260510071332.002_20260510_071332 Paper: self.20260510071332.002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510071332.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 07:13	Success	-	View
exp_cr_10.1007_s44163-026-01360-7_20260510_071117 Paper: cr_10.1007_s44163-026-01360-7	World model inspired sarcasm reasoning with large language model agents Paper ID: cr_10.1007_s44163-026-01360-7 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Reco...	05-10 07:11	Success	-	View
exp_self.20260510070645.001_20260510_070646 Paper: self.20260510070645.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510070645.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 07:06	Success	-	View
exp_pytrain.20260510070514.001_20260510_070514 Paper: pytrain.20260510070514.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 07:05	Success	-	View
exp_self.20260510070202.002_20260510_070202 Paper: self.20260510070202.002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510070202.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 07:02	Success	-	View
exp_self.20260510065531.001_20260510_065531 Paper: self.20260510065531.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510065531.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 06:55	Success	-	View
exp_pytrain.20260510065400.001_20260510_065400 Paper: pytrain.20260510065400.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 06:54	Success	-	View
exp_self.20260510064650.003_20260510_064650 Paper: self.20260510064650.003	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510064650.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 06:46	Success	-	View
exp_self.20260510064019.002_20260510_064020 Paper: self.20260510064019.002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510064019.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 06:40	Success	-	View
exp_self.20260510063349.001_20260510_063349 Paper: self.20260510063349.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510063349.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 06:33	Success	-	View
exp_pytrain.20260510063218.001_20260510_063218 Paper: pytrain.20260510063218.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 06:32	Success	-	View
exp_self.20260510062803.001_20260510_062804 Paper: self.20260510062803.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510062803.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 06:28	Success	-	View
exp_pytrain.20260510062632.001_20260510_062633 Paper: pytrain.20260510062632.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 06:26	Success	-	View
exp_self.20260510061415.192_20260510_061416 Paper: self.20260510061415.192	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510061415.192 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 06:14	Success	-	View
exp_self.20260510060738.191_20260510_060738 Paper: self.20260510060738.191	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510060738.191 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 06:07	Success	-	View
exp_pytrain.20260510060348.041_20260510_060348 Paper: pytrain.20260510060348.041	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 06:03	Success	-	View
exp_self.20260510060135.190_20260510_060135 Paper: self.20260510060135.190	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510060135.190 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 06:01	Success	-	View
exp_self.20260510055502.189_20260510_055503 Paper: self.20260510055502.189	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510055502.189 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 05:55	Success	-	View
exp_self.20260510054830.188_20260510_054831 Paper: self.20260510054830.188	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510054830.188 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 05:48	Success	-	View
exp_self.20260510054159.187_20260510_054159 Paper: self.20260510054159.187	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510054159.187 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 05:42	Success	-	View
exp_self.20260510053522.186_20260510_053523 Paper: self.20260510053522.186	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510053522.186 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 05:35	Success	-	View
exp_pytrain.20260510053242.040_20260510_053243 Paper: pytrain.20260510053242.040	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 05:32	Success	-	View
exp_self.20260510052922.185_20260510_052923 Paper: self.20260510052922.185	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510052922.185 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 05:29	Success	-	View
exp_self.20260510052246.184_20260510_052247 Paper: self.20260510052246.184	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510052246.184 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 05:22	Success	-	View
exp_self.20260510051612.183_20260510_051612 Paper: self.20260510051612.183	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510051612.183 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 05:16	Success	-	View
exp_self.20260510050937.182_20260510_050937 Paper: self.20260510050937.182	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510050937.182 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 05:09	Success	-	View
exp_self.20260510050305.181_20260510_050305 Paper: self.20260510050305.181	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510050305.181 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 05:03	Success	-	View
exp_pytrain.20260510050129.039_20260510_050130 Paper: pytrain.20260510050129.039	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 05:01	Success	-	View
exp_self.20260510045537.180_20260510_045537 Paper: self.20260510045537.180	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510045537.180 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 04:55	Success	-	View
exp_self.20260510044859.179_20260510_044859 Paper: self.20260510044859.179	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510044859.179 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 04:49	Success	-	View
exp_self.20260510044226.178_20260510_044227 Paper: self.20260510044226.178	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510044226.178 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 04:42	Success	-	View
exp_self.20260510043555.177_20260510_043555 Paper: self.20260510043555.177	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510043555.177 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 04:35	Success	-	View
exp_pytrain.20260510043033.038_20260510_043034 Paper: pytrain.20260510043033.038	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 04:30	Success	-	View
exp_self.20260510042928.176_20260510_042929 Paper: self.20260510042928.176	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510042928.176 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 04:29	Success	-	View
exp_self.20260510042256.175_20260510_042257 Paper: self.20260510042256.175	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510042256.175 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 04:22	Success	-	View
exp_self.20260510041624.174_20260510_041624 Paper: self.20260510041624.174	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510041624.174 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 04:16	Success	-	View
exp_self.20260510040943.173_20260510_040943 Paper: self.20260510040943.173	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510040943.173 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 04:09	Success	-	View
exp_self.20260510040304.172_20260510_040305 Paper: self.20260510040304.172	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510040304.172 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 04:03	Success	-	View
exp_pytrain.20260510035917.037_20260510_035917 Paper: pytrain.20260510035917.037	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 03:59	Success	-	View
exp_self.20260510035705.171_20260510_035705 Paper: self.20260510035705.171	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510035705.171 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 03:57	Success	-	View
exp_self.20260510035028.170_20260510_035028 Paper: self.20260510035028.170	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510035028.170 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 03:50	Success	-	View
exp_self.20260510034356.169_20260510_034357 Paper: self.20260510034356.169	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510034356.169 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 03:43	Success	-	View
exp_self.20260510033718.168_20260510_033718 Paper: self.20260510033718.168	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510033718.168 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 03:37	Success	-	View
exp_self.20260510033046.167_20260510_033046 Paper: self.20260510033046.167	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510033046.167 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 03:30	Success	-	View
exp_pytrain.20260510032805.036_20260510_032806 Paper: pytrain.20260510032805.036	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 03:28	Success	-	View
exp_self.20260510032446.166_20260510_032447 Paper: self.20260510032446.166	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510032446.166 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 03:24	Success	-	View
exp_self.20260510031815.165_20260510_031816 Paper: self.20260510031815.165	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510031815.165 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 03:18	Success	-	View
exp_self.20260510031140.164_20260510_031140 Paper: self.20260510031140.164	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510031140.164 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 03:11	Success	-	View
exp_self.20260510030506.163_20260510_030506 Paper: self.20260510030506.163	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510030506.163 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 03:05	Success	-	View
exp_self.20260510025834.162_20260510_025835 Paper: self.20260510025834.162	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510025834.162 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 02:58	Success	-	View
exp_pytrain.20260510025659.035_20260510_025700 Paper: pytrain.20260510025659.035	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 02:57	Success	-	View
exp_self.20260510025105.161_20260510_025106 Paper: self.20260510025105.161	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510025105.161 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 02:51	Success	-	View
exp_self.20260510024430.160_20260510_024430 Paper: self.20260510024430.160	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510024430.160 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 02:44	Success	-	View
exp_self.20260510023758.159_20260510_023758 Paper: self.20260510023758.159	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510023758.159 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 02:38	Success	-	View
exp_self.20260510023128.158_20260510_023128 Paper: self.20260510023128.158	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510023128.158 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 02:31	Success	-	View
exp_pytrain.20260510022607.034_20260510_022607 Paper: pytrain.20260510022607.034	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 02:26	Success	-	View
exp_self.20260510022504.157_20260510_022504 Paper: self.20260510022504.157	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510022504.157 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 02:25	Success	-	View
exp_self.20260510021831.156_20260510_021832 Paper: self.20260510021831.156	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510021831.156 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 02:18	Success	-	View
exp_self.20260510021200.155_20260510_021200 Paper: self.20260510021200.155	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510021200.155 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 02:12	Success	-	View
exp_self.20260510020453.154_20260510_020453 Paper: self.20260510020453.154	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510020453.154 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 02:04	Success	-	View
exp_self.20260510015734.153_20260510_015735 Paper: self.20260510015734.153	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510015734.153 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 01:57	Success	-	View
exp_pytrain.20260510015452.033_20260510_015452 Paper: pytrain.20260510015452.033	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 01:54	Success	-	View
exp_self.20260510015133.152_20260510_015134 Paper: self.20260510015133.152	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510015133.152 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 01:51	Success	-	View
exp_self.20260510014500.151_20260510_014500 Paper: self.20260510014500.151	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510014500.151 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 01:45	Success	-	View
exp_self.20260510013828.150_20260510_013829 Paper: self.20260510013828.150	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510013828.150 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 01:38	Success	-	View
exp_self.20260510013154.149_20260510_013154 Paper: self.20260510013154.149	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510013154.149 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 01:31	Success	-	View
exp_self.20260510012523.148_20260510_012524 Paper: self.20260510012523.148	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510012523.148 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 01:25	Success	-	View
exp_pytrain.20260510012353.032_20260510_012354 Paper: pytrain.20260510012353.032	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 01:23	Success	-	View
exp_self.20260510011745.147_20260510_011746 Paper: self.20260510011745.147	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510011745.147 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 01:17	Success	-	View
exp_self.20260510011113.146_20260510_011114 Paper: self.20260510011113.146	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510011113.146 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 01:11	Success	-	View
exp_self.20260510010436.145_20260510_010436 Paper: self.20260510010436.145	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510010436.145 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 01:04	Success	-	View
exp_self.20260510005732.144_20260510_005732 Paper: self.20260510005732.144	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510005732.144 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 00:57	Success	-	View
exp_pytrain.20260510005212.031_20260510_005212 Paper: pytrain.20260510005212.031	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 00:52	Success	-	View
exp_self.20260510005109.143_20260510_005109 Paper: self.20260510005109.143	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510005109.143 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 00:51	Success	-	View
exp_self.20260510004437.142_20260510_004438 Paper: self.20260510004437.142	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510004437.142 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 00:44	Success	-	View
exp_self.20260510003807.141_20260510_003808 Paper: self.20260510003807.141	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510003807.141 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 00:38	Success	-	View
exp_self.20260510003135.140_20260510_003135 Paper: self.20260510003135.140	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510003135.140 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 00:31	Success	-	View
exp_self.20260510002504.139_20260510_002505 Paper: self.20260510002504.139	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510002504.139 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 00:25	Success	-	View
exp_pytrain.20260510002114.030_20260510_002114 Paper: pytrain.20260510002114.030	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-10 00:21	Success	-	View
exp_self.20260510001906.138_20260510_001907 Paper: self.20260510001906.138	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510001906.138 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 00:19	Success	-	View
exp_self.20260510001235.137_20260510_001235 Paper: self.20260510001235.137	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510001235.137 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 00:12	Success	-	View
exp_self.20260510000558.136_20260510_000559 Paper: self.20260510000558.136	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260510000558.136 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-10 00:06	Success	-	View
exp_self.20260509235924.135_20260509_235924 Paper: self.20260509235924.135	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509235924.135 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 23:59	Success	-	View
exp_self.20260509235253.134_20260509_235254 Paper: self.20260509235253.134	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509235253.134 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 23:52	Success	-	View
exp_pytrain.20260509235012.029_20260509_235013 Paper: pytrain.20260509235012.029	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 23:50	Success	-	View
exp_self.20260509234653.133_20260509_234653 Paper: self.20260509234653.133	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509234653.133 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 23:46	Success	-	View
exp_self.20260509234024.132_20260509_234024 Paper: self.20260509234024.132	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509234024.132 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 23:40	Success	-	View
exp_self.20260509233353.131_20260509_233354 Paper: self.20260509233353.131	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509233353.131 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 23:33	Success	-	View
exp_self.20260509232718.130_20260509_232719 Paper: self.20260509232718.130	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509232718.130 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 23:27	Success	-	View
exp_self.20260509232042.129_20260509_232043 Paper: self.20260509232042.129	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509232042.129 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 23:20	Success	-	View
exp_pytrain.20260509231912.028_20260509_231912 Paper: pytrain.20260509231912.028	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 23:19	Success	-	View
exp_self.20260509231309.128_20260509_231310 Paper: self.20260509231309.128	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509231309.128 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 23:13	Success	-	View
exp_self.20260509230637.127_20260509_230637 Paper: self.20260509230637.127	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509230637.127 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 23:06	Success	-	View
exp_self.20260509230005.126_20260509_230005 Paper: self.20260509230005.126	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509230005.126 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 23:00	Success	-	View
exp_self.20260509225320.125_20260509_225320 Paper: self.20260509225320.125	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509225320.125 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 22:53	Success	-	View
exp_pytrain.20260509224800.027_20260509_224800 Paper: pytrain.20260509224800.027	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 22:48	Success	-	View
exp_self.20260509224656.124_20260509_224657 Paper: self.20260509224656.124	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509224656.124 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 22:46	Success	-	View
exp_self.20260509224021.123_20260509_224022 Paper: self.20260509224021.123	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509224021.123 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 22:40	Success	-	View
exp_self.20260509223351.122_20260509_223351 Paper: self.20260509223351.122	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509223351.122 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 22:33	Success	-	View
exp_self.20260509222720.121_20260509_222720 Paper: self.20260509222720.121	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509222720.121 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 22:27	Success	-	View
exp_self.20260509222047.120_20260509_222047 Paper: self.20260509222047.120	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509222047.120 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 22:20	Success	-	View
exp_pytrain.20260509221656.026_20260509_221656 Paper: pytrain.20260509221656.026	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 22:16	Success	-	View
exp_self.20260509221446.119_20260509_221447 Paper: self.20260509221446.119	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509221446.119 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 22:14	Success	-	View
exp_self.20260509220816.118_20260509_220816 Paper: self.20260509220816.118	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509220816.118 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 22:08	Success	-	View
exp_self.20260509220141.117_20260509_220141 Paper: self.20260509220141.117	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509220141.117 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 22:01	Success	-	View
exp_self.20260509215502.116_20260509_215502 Paper: self.20260509215502.116	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509215502.116 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 21:55	Success	-	View
exp_self.20260509214827.115_20260509_214827 Paper: self.20260509214827.115	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509214827.115 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 21:48	Success	-	View
exp_pytrain.20260509214546.025_20260509_214546 Paper: pytrain.20260509214546.025	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 21:45	Success	-	View
exp_self.20260509214227.114_20260509_214227 Paper: self.20260509214227.114	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509214227.114 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 21:42	Success	-	View
exp_self.20260509213554.113_20260509_213555 Paper: self.20260509213554.113	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509213554.113 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 21:35	Success	-	View
exp_self.20260509212919.112_20260509_212919 Paper: self.20260509212919.112	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509212919.112 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 21:29	Success	-	View
exp_self.20260509212245.111_20260509_212246 Paper: self.20260509212245.111	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509212245.111 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 21:22	Success	-	View
exp_self.20260509211610.110_20260509_211611 Paper: self.20260509211610.110	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509211610.110 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 21:16	Success	-	View
exp_pytrain.20260509211439.024_20260509_211439 Paper: pytrain.20260509211439.024	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 21:14	Success	-	View
exp_self.20260509210836.109_20260509_210836 Paper: self.20260509210836.109	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509210836.109 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 21:08	Success	-	View
exp_self.20260509210157.108_20260509_210158 Paper: self.20260509210157.108	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509210157.108 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 21:02	Success	-	View
exp_self.20260509205529.107_20260509_205529 Paper: self.20260509205529.107	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509205529.107 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 20:55	Success	-	View
exp_self.20260509204845.106_20260509_204846 Paper: self.20260509204845.106	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509204845.106 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 20:48	Success	-	View
exp_pytrain.20260509204307.023_20260509_204308 Paper: pytrain.20260509204307.023	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 20:43	Success	-	View
exp_self.20260509204205.105_20260509_204205 Paper: self.20260509204205.105	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509204205.105 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 20:42	Success	-	View
exp_self.20260509203524.104_20260509_203524 Paper: self.20260509203524.104	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509203524.104 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 20:35	Success	-	View
exp_self.20260509202852.103_20260509_202852 Paper: self.20260509202852.103	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509202852.103 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 20:28	Success	-	View
exp_self.20260509202223.102_20260509_202223 Paper: self.20260509202223.102	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509202223.102 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 20:22	Success	-	View
exp_self.20260509201552.101_20260509_201552 Paper: self.20260509201552.101	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509201552.101 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 20:15	Success	-	View
exp_pytrain.20260509201200.022_20260509_201201 Paper: pytrain.20260509201200.022	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 20:12	Success	-	View
exp_self.20260509200952.100_20260509_200952 Paper: self.20260509200952.100	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509200952.100 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 20:09	Success	-	View
exp_self.20260509200320.099_20260509_200321 Paper: self.20260509200320.099	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509200320.099 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 20:03	Success	-	View
exp_self.20260509195650.098_20260509_195651 Paper: self.20260509195650.098	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509195650.098 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 19:56	Success	-	View
exp_self.20260509195014.097_20260509_195015 Paper: self.20260509195014.097	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509195014.097 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 19:50	Success	-	View
exp_self.20260509194341.096_20260509_194342 Paper: self.20260509194341.096	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509194341.096 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 19:43	Success	-	View
exp_pytrain.20260509194101.021_20260509_194101 Paper: pytrain.20260509194101.021	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 19:41	Success	-	View
exp_self.20260509193744.095_20260509_193744 Paper: self.20260509193744.095	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509193744.095 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 19:37	Success	-	View
exp_self.20260509193106.094_20260509_193107 Paper: self.20260509193106.094	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509193106.094 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 19:31	Success	-	View
exp_self.20260509192430.093_20260509_192431 Paper: self.20260509192430.093	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509192430.093 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 19:24	Success	-	View
exp_self.20260509191757.092_20260509_191757 Paper: self.20260509191757.092	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509191757.092 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 19:17	Success	-	View
exp_self.20260509191122.091_20260509_191122 Paper: self.20260509191122.091	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509191122.091 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 19:11	Success	-	View
exp_pytrain.20260509190950.020_20260509_190950 Paper: pytrain.20260509190950.020	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 19:09	Success	-	View
exp_self.20260509190343.090_20260509_190344 Paper: self.20260509190343.090	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509190343.090 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 19:03	Success	-	View
exp_self.20260509185706.089_20260509_185706 Paper: self.20260509185706.089	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509185706.089 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 18:57	Success	-	View
exp_self.20260509185036.088_20260509_185037 Paper: self.20260509185036.088	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509185036.088 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 18:50	Success	-	View
exp_self.20260509184358.087_20260509_184359 Paper: self.20260509184358.087	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509184358.087 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 18:44	Success	-	View
exp_pytrain.20260509183839.019_20260509_183840 Paper: pytrain.20260509183839.019	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 18:38	Success	-	View
exp_self.20260509183736.086_20260509_183736 Paper: self.20260509183736.086	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509183736.086 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 18:37	Success	-	View
exp_self.20260509183105.085_20260509_183105 Paper: self.20260509183105.085	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509183105.085 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 18:31	Success	-	View
exp_self.20260509182431.084_20260509_182431 Paper: self.20260509182431.084	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509182431.084 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 18:24	Success	-	View
exp_self.20260509181754.083_20260509_181754 Paper: self.20260509181754.083	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509181754.083 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 18:17	Success	-	View
exp_self.20260509181123.082_20260509_181123 Paper: self.20260509181123.082	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509181123.082 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 18:11	Success	-	View
exp_pytrain.20260509180738.018_20260509_180739 Paper: pytrain.20260509180738.018	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 18:07	Success	-	View
exp_self.20260509180523.081_20260509_180524 Paper: self.20260509180523.081	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509180523.081 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 18:05	Success	-	View
exp_self.20260509175852.080_20260509_175852 Paper: self.20260509175852.080	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509175852.080 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 17:58	Success	-	View
exp_self.20260509175224.079_20260509_175224 Paper: self.20260509175224.079	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509175224.079 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 17:52	Success	-	View
exp_self.20260509174550.078_20260509_174550 Paper: self.20260509174550.078	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509174550.078 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 17:45	Success	-	View
exp_self.20260509173907.077_20260509_173907 Paper: self.20260509173907.077	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509173907.077 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 17:39	Success	-	View
exp_pytrain.20260509173626.017_20260509_173627 Paper: pytrain.20260509173626.017	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 17:36	Success	-	View
exp_self.20260509173310.076_20260509_173310 Paper: self.20260509173310.076	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509173310.076 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 17:33	Success	-	View
exp_self.20260509172637.075_20260509_172637 Paper: self.20260509172637.075	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509172637.075 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 17:26	Success	-	View
exp_self.20260509172007.074_20260509_172007 Paper: self.20260509172007.074	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509172007.074 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 17:20	Success	-	View
exp_self.20260509171336.073_20260509_171336 Paper: self.20260509171336.073	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509171336.073 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 17:13	Success	-	View
exp_self.20260509170700.072_20260509_170700 Paper: self.20260509170700.072	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509170700.072 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 17:07	Success	-	View
exp_pytrain.20260509170523.016_20260509_170523 Paper: pytrain.20260509170523.016	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 17:05	Success	-	View
exp_self.20260509165921.071_20260509_165921 Paper: self.20260509165921.071	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509165921.071 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 16:59	Success	-	View
exp_self.20260509165253.070_20260509_165253 Paper: self.20260509165253.070	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509165253.070 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 16:52	Success	-	View
exp_self.20260509164624.069_20260509_164625 Paper: self.20260509164624.069	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509164624.069 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 16:46	Success	-	View
exp_self.20260509163954.068_20260509_163955 Paper: self.20260509163954.068	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509163954.068 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 16:39	Success	-	View
exp_pytrain.20260509163500.015_20260509_163500 Paper: pytrain.20260509163500.015	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 16:35	Success	-	View
exp_self.20260509163357.067_20260509_163357 Paper: self.20260509163357.067	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509163357.067 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 16:34	Success	-	View
exp_self.20260509162728.066_20260509_162728 Paper: self.20260509162728.066	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509162728.066 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 16:27	Success	-	View
exp_self.20260509162054.065_20260509_162055 Paper: self.20260509162054.065	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509162054.065 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 16:20	Success	-	View
exp_self.20260509161418.064_20260509_161419 Paper: self.20260509161418.064	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509161418.064 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 16:14	Success	-	View
exp_self.20260509160749.063_20260509_160750 Paper: self.20260509160749.063	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509160749.063 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 16:07	Success	-	View
exp_pytrain.20260509160406.014_20260509_160406 Paper: pytrain.20260509160406.014	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 16:04	Success	-	View
exp_self.20260509160146.062_20260509_160146 Paper: self.20260509160146.062	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509160146.062 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 16:01	Success	-	View
exp_self.20260509155516.061_20260509_155517 Paper: self.20260509155516.061	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509155516.061 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 15:55	Success	-	View
exp_self.20260509154848.060_20260509_154849 Paper: self.20260509154848.060	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509154848.060 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 15:48	Success	-	View
exp_self.20260509154214.059_20260509_154215 Paper: self.20260509154214.059	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509154214.059 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 15:42	Success	-	View
exp_self.20260509153540.058_20260509_153540 Paper: self.20260509153540.058	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509153540.058 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 15:35	Success	-	View
exp_pytrain.20260509153300.013_20260509_153301 Paper: pytrain.20260509153300.013	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 15:33	Success	-	View
exp_self.20260509152945.057_20260509_152945 Paper: self.20260509152945.057	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509152945.057 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 15:29	Success	-	View
exp_self.20260509152309.056_20260509_152309 Paper: self.20260509152309.056	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509152309.056 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 15:23	Success	-	View
exp_cr_10.1093_mnras_stag893_20260509_151945 Paper: cr_10.1093_mnras_stag893	AstroSpec-LLM: A Large Language Model Framework for High-throughput Infrared Spectral Prediction of Interstellar PAHs Paper ID: cr_10.1093_mnras_stag893 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 15:19	Success	-	View
exp_self.20260509151623.055_20260509_151623 Paper: self.20260509151623.055	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509151623.055 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 15:16	Success	-	View
exp_self.20260509150954.054_20260509_150955 Paper: self.20260509150954.054	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509150954.054 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 15:09	Success	-	View
exp_self.20260509150327.053_20260509_150327 Paper: self.20260509150327.053	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509150327.053 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 15:03	Success	-	View
exp_pytrain.20260509150150.012_20260509_150150 Paper: pytrain.20260509150150.012	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 15:01	Success	-	View
exp_self.20260509145551.052_20260509_145551 Paper: self.20260509145551.052	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509145551.052 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 14:55	Success	-	View
exp_self.20260509144922.051_20260509_144922 Paper: self.20260509144922.051	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509144922.051 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 14:49	Success	-	View
exp_self.20260509144254.050_20260509_144254 Paper: self.20260509144254.050	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509144254.050 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 14:42	Success	-	View
exp_self.20260509143626.049_20260509_143626 Paper: self.20260509143626.049	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509143626.049 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 14:36	Success	-	View
exp_pytrain.20260509143105.011_20260509_143105 Paper: pytrain.20260509143105.011	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 14:31	Success	-	View
exp_self.20260509143003.048_20260509_143003 Paper: self.20260509143003.048	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509143003.048 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 14:30	Success	-	View
exp_self.20260509142335.047_20260509_142335 Paper: self.20260509142335.047	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509142335.047 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 14:23	Success	-	View
exp_self.20260509141706.046_20260509_141707 Paper: self.20260509141706.046	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509141706.046 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 14:17	Success	-	View
exp_self.20260509141031.045_20260509_141031 Paper: self.20260509141031.045	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509141031.045 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 14:10	Success	-	View
exp_self.20260509140402.044_20260509_140403 Paper: self.20260509140402.044	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509140402.044 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 14:04	Success	-	View
exp_pytrain.20260509140017.010_20260509_140018 Paper: pytrain.20260509140017.010	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 14:00	Success	-	View
exp_self.20260509135806.043_20260509_135806 Paper: self.20260509135806.043	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509135806.043 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 13:58	Success	-	View
exp_self.20260509135136.042_20260509_135136 Paper: self.20260509135136.042	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509135136.042 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 13:51	Success	-	View
exp_self.20260509134507.041_20260509_134507 Paper: self.20260509134507.041	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509134507.041 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 13:45	Success	-	View
exp_self.20260509133833.040_20260509_133833 Paper: self.20260509133833.040	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509133833.040 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 13:38	Success	-	View
exp_self.20260509133153.039_20260509_133153 Paper: self.20260509133153.039	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509133153.039 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 13:31	Success	-	View
exp_pytrain.20260509132912.009_20260509_132912 Paper: pytrain.20260509132912.009	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 13:29	Success	-	View
exp_self.20260509132553.038_20260509_132553 Paper: self.20260509132553.038	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509132553.038 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 13:25	Success	-	View
exp_self.20260509131915.037_20260509_131915 Paper: self.20260509131915.037	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509131915.037 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 13:19	Success	-	View
exp_self.20260509131240.036_20260509_131240 Paper: self.20260509131240.036	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509131240.036 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 13:12	Success	-	View
exp_self.20260509130608.035_20260509_130608 Paper: self.20260509130608.035	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509130608.035 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 13:06	Success	-	View
exp_self.20260509125937.034_20260509_125937 Paper: self.20260509125937.034	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509125937.034 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 12:59	Success	-	View
exp_pytrain.20260509125801.008_20260509_125801 Paper: pytrain.20260509125801.008	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 12:58	Success	-	View
exp_self.20260509125210.033_20260509_125210 Paper: self.20260509125210.033	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509125210.033 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 12:52	Success	-	View
exp_self.20260509124536.032_20260509_124536 Paper: self.20260509124536.032	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509124536.032 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 12:45	Success	-	View
exp_self.20260509123903.031_20260509_123903 Paper: self.20260509123903.031	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509123903.031 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 12:39	Success	-	View
exp_self.20260509123233.030_20260509_123233 Paper: self.20260509123233.030	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509123233.030 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 12:32	Success	-	View
exp_pytrain.20260509122711.007_20260509_122712 Paper: pytrain.20260509122711.007	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 12:27	Success	-	View
exp_self.20260509122608.029_20260509_122608 Paper: self.20260509122608.029	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509122608.029 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 12:26	Success	-	View
exp_self.20260509121940.028_20260509_121941 Paper: self.20260509121940.028	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509121940.028 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 12:19	Success	-	View
exp_cr_10.1093_bioinformatics_btag260_20260509_121746 Paper: cr_10.1093_bioinformatics_btag260	Protein Language Model Embeddings Improve HIV Drug Resistance Prediction: A Comprehensive Benchmark with Attention-Based... Paper ID: cr_10.1093_bioinformatics_btag260 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal:...	05-09 12:17	Success	-	View
exp_self.20260509121141.027_20260509_121141 Paper: self.20260509121141.027	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509121141.027 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 12:11	Success	-	View
exp_self.20260509120511.026_20260509_120511 Paper: self.20260509120511.026	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509120511.026 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 12:05	Success	-	View
exp_self.20260509115841.025_20260509_115842 Paper: self.20260509115841.025	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509115841.025 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 11:58	Success	-	View
exp_pytrain.20260509115601.006_20260509_115601 Paper: pytrain.20260509115601.006	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 11:56	Success	-	View
exp_self.20260509115241.024_20260509_115242 Paper: self.20260509115241.024	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509115241.024 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 11:52	Success	-	View
exp_self.20260509114612.023_20260509_114612 Paper: self.20260509114612.023	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509114612.023 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 11:46	Success	-	View
exp_self.20260509114028.022_20260509_114028 Paper: self.20260509114028.022	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509114028.022 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 11:40	Success	-	View
exp_self.20260509113400.021_20260509_113401 Paper: self.20260509113400.021	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509113400.021 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 11:34	Success	-	View
exp_self.20260509112733.020_20260509_112733 Paper: self.20260509112733.020	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509112733.020 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 11:27	Success	-	View
exp_pytrain.20260509112452.005_20260509_112453 Paper: pytrain.20260509112452.005	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 11:24	Success	-	View
exp_self.20260509112134.019_20260509_112135 Paper: self.20260509112134.019	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509112134.019 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 11:21	Success	-	View
exp_self.20260509111506.018_20260509_111506 Paper: self.20260509111506.018	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509111506.018 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 11:15	Success	-	View
exp_self.20260509110829.017_20260509_110830 Paper: self.20260509110829.017	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509110829.017 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 11:08	Success	-	View
exp_self.20260509110157.016_20260509_110158 Paper: self.20260509110157.016	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509110157.016 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 11:02	Success	-	View
exp_self.20260509105527.015_20260509_105528 Paper: self.20260509105527.015	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509105527.015 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 10:55	Success	-	View
exp_pytrain.20260509105352.004_20260509_105353 Paper: pytrain.20260509105352.004	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 10:53	Success	-	View
exp_self.20260509104800.014_20260509_104800 Paper: self.20260509104800.014	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509104800.014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 10:48	Success	-	View
exp_self.20260509104125.013_20260509_104126 Paper: self.20260509104125.013	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509104125.013 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 10:41	Success	-	View
exp_self.20260509103449.012_20260509_103450 Paper: self.20260509103449.012	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509103449.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 10:34	Success	-	View
exp_self.20260509102814.011_20260509_102814 Paper: self.20260509102814.011	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509102814.011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 10:28	Success	-	View
exp_pytrain.20260509102249.003_20260509_102249 Paper: pytrain.20260509102249.003	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 10:22	Success	-	View
exp_self.20260509102142.010_20260509_102143 Paper: self.20260509102142.010	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509102142.010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 10:21	Success	-	View
exp_self.20260509101508.009_20260509_101509 Paper: self.20260509101508.009	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509101508.009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 10:15	Success	-	View
exp_self.20260509100821.008_20260509_100822 Paper: self.20260509100821.008	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509100821.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 10:08	Success	-	View
exp_self.20260509100143.007_20260509_100143 Paper: self.20260509100143.007	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509100143.007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 10:01	Success	-	View
exp_self.20260509095503.006_20260509_095503 Paper: self.20260509095503.006	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509095503.006 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 09:55	Success	-	View
exp_pytrain.20260509095114.002_20260509_095114 Paper: pytrain.20260509095114.002	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 09:51	Success	-	View
exp_self.20260509094903.005_20260509_094904 Paper: self.20260509094903.005	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509094903.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 09:49	Success	-	View
exp_self.20260509094221.004_20260509_094222 Paper: self.20260509094221.004	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509094221.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 09:42	Success	-	View
exp_self.20260509093551.003_20260509_093551 Paper: self.20260509093551.003	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509093551.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 09:35	Success	-	View
exp_self.20260509092838.002_20260509_092838 Paper: self.20260509092838.002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509092838.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 09:28	Success	-	View
exp_self.20260509092206.001_20260509_092207 Paper: self.20260509092206.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509092206.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 09:22	Success	-	View
exp_pytrain.20260509092035.001_20260509_092035 Paper: pytrain.20260509092035.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 09:20	Success	-	View
exp_pytrain.20260509090930.001_20260509_090931 Paper: pytrain.20260509090930.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 09:10	Success	-	View
exp_self.20260509090017.001_20260509_090018 Paper: self.20260509090017.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509090017.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 09:01	Success	-	View
exp_pytrain.20260509085747.001_20260509_085747 Paper: pytrain.20260509085747.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 08:58	Success	-	View
exp_pytrain.20260509084551.001_20260509_084551 Paper: pytrain.20260509084551.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 08:46	Success	-	View
exp_self.20260509084242.003_20260509_084243 Paper: self.20260509084242.003	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509084242.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 08:43	Success	-	View
exp_self.20260509083508.002_20260509_083509 Paper: self.20260509083508.002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509083508.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 08:36	Success	-	View
exp_self.20260509082736.001_20260509_082737 Paper: self.20260509082736.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509082736.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 08:28	Success	-	View
exp_pytrain.20260509082506.001_20260509_082506 Paper: pytrain.20260509082506.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 08:26	Success	-	View
exp_self.20260509082304.004_20260509_082305 Paper: self.20260509082304.004	self.20260509082304.004 No summary available yet.	05-09 08:23	Success	-	View
exp_self.20260509081631.003_20260509_081631 Paper: self.20260509081631.003	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509081631.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 08:16	Failed	GPU_REQUIRED policy blocked benchmark execution.	View
exp_self.20260509080957.002_20260509_080958 Paper: self.20260509080957.002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509080957.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 08:09	Failed	GPU_REQUIRED policy blocked benchmark execution.	View
exp_self.20260509080324.001_20260509_080324 Paper: self.20260509080324.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509080324.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 08:03	Failed	GPU_REQUIRED policy blocked benchmark execution.	View
exp_pytrain.20260509080147.001_20260509_080148 Paper: pytrain.20260509080147.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 08:01	Failed	GPU_REQUIRED policy blocked benchmark execution.	View
exp_pytrain.20260509075902.001_20260509_075903 Paper: pytrain.20260509075902.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 07:59	Pending	-	View
exp_pytrain.20260509075611.001_20260509_075612 Paper: pytrain.20260509075611.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 07:56	Pending	-	View
exp_pytrain.20260509075053.001_20260509_075053 Paper: pytrain.20260509075053.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 07:50	Pending	-	View
exp_self.20260509074650.374_20260509_074650 Paper: self.20260509074650.374	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509074650.374 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 07:47	Success	-	View
exp_self.20260509073856.373_20260509_073857 Paper: self.20260509073856.373	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509073856.373 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 07:40	Success	-	View
exp_self.20260509073042.372_20260509_073042 Paper: self.20260509073042.372	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509073042.372 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 07:31	Success	-	View
exp_self.20260509072242.371_20260509_072243 Paper: self.20260509072242.371	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509072242.371 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 07:23	Success	-	View
exp_pytrain.20260509072006.092_20260509_072006 Paper: pytrain.20260509072006.092	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 07:21	Success	-	View
exp_self.20260509071426.370_20260509_071426 Paper: self.20260509071426.370	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509071426.370 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 07:15	Success	-	View
exp_self.20260509070644.369_20260509_070644 Paper: self.20260509070644.369	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509070644.369 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 07:07	Success	-	View
exp_self.20260509065858.368_20260509_065858 Paper: self.20260509065858.368	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509065858.368 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 07:00	Success	-	View
exp_self.20260509065116.367_20260509_065116 Paper: self.20260509065116.367	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509065116.367 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 06:52	Success	-	View
exp_pytrain.20260509064839.091_20260509_064839 Paper: pytrain.20260509064839.091	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 06:49	Success	-	View
exp_self.20260509064301.366_20260509_064301 Paper: self.20260509064301.366	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509064301.366 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 06:44	Success	-	View
exp_self.20260509063548.365_20260509_063548 Paper: self.20260509063548.365	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509063548.365 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 06:36	Success	-	View
exp_self.20260509062839.364_20260509_062839 Paper: self.20260509062839.364	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509062839.364 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 06:29	Success	-	View
exp_self.20260509062051.363_20260509_062051 Paper: self.20260509062051.363	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509062051.363 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 06:21	Success	-	View
exp_pytrain.20260509061655.090_20260509_061656 Paper: pytrain.20260509061655.090	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 06:17	Success	-	View
exp_self.20260509061333.362_20260509_061334 Paper: self.20260509061333.362	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509061333.362 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 06:14	Success	-	View
exp_self.20260509060550.361_20260509_060550 Paper: self.20260509060550.361	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509060550.361 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 06:06	Success	-	View
exp_self.20260509055810.360_20260509_055810 Paper: self.20260509055810.360	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509055810.360 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 05:59	Success	-	View
exp_self.20260509055030.359_20260509_055030 Paper: self.20260509055030.359	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509055030.359 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 05:51	Success	-	View
exp_pytrain.20260509054531.089_20260509_054531 Paper: pytrain.20260509054531.089	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 05:46	Success	-	View
exp_self.20260509054323.358_20260509_054324 Paper: self.20260509054323.358	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509054323.358 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 05:44	Success	-	View
exp_gh_naranor_wamp-proxy_20260509_054005 Paper: gh_naranor_wamp-proxy	naranor/wamp-proxy Paper ID: gh_naranor_wamp-proxy - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered be...	05-09 05:41	Success	-	View
exp_self.20260509053427.357_20260509_053427 Paper: self.20260509053427.357	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509053427.357 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 05:35	Success	-	View
exp_gh_jacksong-sourse_sll-core_20260509_053131 Paper: gh_jacksong-sourse_sll-core	jacksong-sourse/sll-core Paper ID: gh_jacksong-sourse_sll-core - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recove...	05-09 05:32	Success	-	View
exp_self.20260509052412.356_20260509_052412 Paper: self.20260509052412.356	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509052412.356 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 05:25	Success	-	View
exp_self.20260509051632.355_20260509_051632 Paper: self.20260509051632.355	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509051632.355 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 05:17	Success	-	View
exp_pytrain.20260509051353.088_20260509_051354 Paper: pytrain.20260509051353.088	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 05:14	Success	-	View
exp_self.20260509050813.354_20260509_050813 Paper: self.20260509050813.354	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509050813.354 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 05:09	Success	-	View
exp_self.20260509050032.353_20260509_050032 Paper: self.20260509050032.353	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509050032.353 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 05:01	Success	-	View
exp_self.20260509045250.352_20260509_045251 Paper: self.20260509045250.352	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509045250.352 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 04:53	Success	-	View
exp_self.20260509044508.351_20260509_044509 Paper: self.20260509044508.351	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509044508.351 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 04:46	Success	-	View
exp_pytrain.20260509044232.087_20260509_044233 Paper: pytrain.20260509044232.087	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 04:43	Success	-	View
exp_self.20260509043649.350_20260509_043649 Paper: self.20260509043649.350	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509043649.350 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 04:37	Success	-	View
exp_self.20260509042912.349_20260509_042912 Paper: self.20260509042912.349	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509042912.349 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 04:30	Success	-	View
exp_self.20260509042133.348_20260509_042134 Paper: self.20260509042133.348	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509042133.348 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 04:22	Success	-	View
exp_self.20260509041343.347_20260509_041344 Paper: self.20260509041343.347	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509041343.347 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 04:14	Success	-	View
exp_pytrain.20260509041108.086_20260509_041108 Paper: pytrain.20260509041108.086	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 04:12	Success	-	View
exp_self.20260509040538.346_20260509_040538 Paper: self.20260509040538.346	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509040538.346 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 04:06	Success	-	View
exp_self.20260509035750.345_20260509_035751 Paper: self.20260509035750.345	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509035750.345 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 03:58	Success	-	View
exp_self.20260509035008.344_20260509_035008 Paper: self.20260509035008.344	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509035008.344 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 03:51	Success	-	View
exp_self.20260509034228.343_20260509_034228 Paper: self.20260509034228.343	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509034228.343 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 03:43	Success	-	View
exp_pytrain.20260509033947.085_20260509_033947 Paper: pytrain.20260509033947.085	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 03:40	Success	-	View
exp_self.20260509033417.342_20260509_033418 Paper: self.20260509033417.342	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509033417.342 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 03:35	Success	-	View
exp_self.20260509032630.341_20260509_032631 Paper: self.20260509032630.341	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509032630.341 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 03:27	Success	-	View
exp_self.20260509031820.340_20260509_031820 Paper: self.20260509031820.340	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509031820.340 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 03:19	Success	-	View
exp_self.20260509031110.339_20260509_031111 Paper: self.20260509031110.339	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509031110.339 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 03:12	Success	-	View
exp_pytrain.20260509030754.084_20260509_030754 Paper: pytrain.20260509030754.084	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 03:08	Success	-	View
exp_self.20260509030034.338_20260509_030035 Paper: self.20260509030034.338	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509030034.338 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 03:01	Success	-	View
exp_self.20260509025327.337_20260509_025328 Paper: self.20260509025327.337	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509025327.337 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 02:54	Success	-	View
exp_self.20260509024558.336_20260509_024559 Paper: self.20260509024558.336	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509024558.336 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 02:47	Success	-	View
exp_self.20260509023848.335_20260509_023848 Paper: self.20260509023848.335	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509023848.335 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 02:39	Success	-	View
exp_pytrain.20260509023553.083_20260509_023554 Paper: pytrain.20260509023553.083	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 02:36	Success	-	View
exp_self.20260509022929.334_20260509_022929 Paper: self.20260509022929.334	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509022929.334 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 02:30	Success	-	View
exp_self.20260509022207.333_20260509_022208 Paper: self.20260509022207.333	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509022207.333 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 02:23	Success	-	View
exp_self.20260509021459.332_20260509_021459 Paper: self.20260509021459.332	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509021459.332 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 02:16	Success	-	View
exp_self.20260509020711.331_20260509_020711 Paper: self.20260509020711.331	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509020711.331 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 02:08	Success	-	View
exp_pytrain.20260509020427.082_20260509_020428 Paper: pytrain.20260509020427.082	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 02:05	Success	-	View
exp_self.20260509015605.330_20260509_015606 Paper: self.20260509015605.330	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509015605.330 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 01:57	Success	-	View
exp_self.20260509014911.329_20260509_014911 Paper: self.20260509014911.329	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509014911.329 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 01:50	Success	-	View
exp_self.20260509014138.328_20260509_014138 Paper: self.20260509014138.328	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509014138.328 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 01:42	Success	-	View
exp_self.20260509013444.327_20260509_013445 Paper: self.20260509013444.327	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509013444.327 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 01:35	Success	-	View
exp_pytrain.20260509013154.081_20260509_013154 Paper: pytrain.20260509013154.081	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 01:32	Success	-	View
exp_self.20260509012530.326_20260509_012530 Paper: self.20260509012530.326	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509012530.326 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 01:26	Success	-	View
exp_self.20260509011835.325_20260509_011836 Paper: self.20260509011835.325	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509011835.325 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 01:19	Success	-	View
exp_self.20260509011141.324_20260509_011141 Paper: self.20260509011141.324	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509011141.324 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 01:12	Success	-	View
exp_self.20260509010448.323_20260509_010448 Paper: self.20260509010448.323	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509010448.323 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 01:05	Success	-	View
exp_pytrain.20260509005902.080_20260509_005902 Paper: pytrain.20260509005902.080	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 01:00	Success	-	View
exp_self.20260509005644.322_20260509_005645 Paper: self.20260509005644.322	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509005644.322 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 00:57	Success	-	View
exp_self.20260509004945.321_20260509_004946 Paper: self.20260509004945.321	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509004945.321 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 00:50	Success	-	View
exp_self.20260509004242.320_20260509_004242 Paper: self.20260509004242.320	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509004242.320 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 00:43	Success	-	View
exp_self.20260509003532.319_20260509_003532 Paper: self.20260509003532.319	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509003532.319 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 00:36	Success	-	View
exp_self.20260509002821.318_20260509_002822 Paper: self.20260509002821.318	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509002821.318 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 00:29	Success	-	View
exp_pytrain.20260509002528.079_20260509_002528 Paper: pytrain.20260509002528.079	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-09 00:26	Success	-	View
exp_self.20260509001851.317_20260509_001851 Paper: self.20260509001851.317	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509001851.317 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 00:19	Success	-	View
exp_cr_10.62762_dia.2026.309098_20260509_001519 Paper: cr_10.62762_dia.2026.309098	Farming Upward: The TsingSky Guangzhou Future Agriculture Cluster as a County-Level Model for Context-Specific Smart Agr... Paper ID: cr_10.62762_dia.2026.309098 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recove...	05-09 00:16	Success	-	View
exp_self.20260509001152.316_20260509_001152 Paper: self.20260509001152.316	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509001152.316 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 00:12	Success	-	View
exp_self.20260509000431.315_20260509_000431 Paper: self.20260509000431.315	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260509000431.315 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-09 00:05	Success	-	View
exp_self.20260508235652.314_20260508_235652 Paper: self.20260508235652.314	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508235652.314 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 23:57	Success	-	View
exp_pytrain.20260508235406.078_20260508_235407 Paper: pytrain.20260508235406.078	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 23:55	Success	-	View
exp_self.20260508234916.313_20260508_234916 Paper: self.20260508234916.313	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508234916.313 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 23:50	Success	-	View
exp_self.20260508234208.312_20260508_234208 Paper: self.20260508234208.312	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508234208.312 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 23:43	Success	-	View
exp_self.20260508233440.311_20260508_233441 Paper: self.20260508233440.311	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508233440.311 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 23:35	Success	-	View
exp_self.20260508232736.310_20260508_232737 Paper: self.20260508232736.310	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508232736.310 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 23:28	Success	-	View
exp_pytrain.20260508232208.077_20260508_232209 Paper: pytrain.20260508232208.077	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 23:23	Success	-	View
exp_self.20260508231942.309_20260508_231952 Paper: self.20260508231942.309	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508231942.309 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 23:20	Success	-	View
exp_self.20260508231224.308_20260508_231224 Paper: self.20260508231224.308	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508231224.308 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 23:13	Success	-	View
exp_self.20260508230515.307_20260508_230516 Paper: self.20260508230515.307	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508230515.307 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 23:06	Success	-	View
exp_self.20260508225803.306_20260508_225803 Paper: self.20260508225803.306	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508225803.306 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 22:59	Success	-	View
exp_self.20260508225108.305_20260508_225108 Paper: self.20260508225108.305	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508225108.305 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 22:52	Success	-	View
exp_pytrain.20260508224815.076_20260508_224815 Paper: pytrain.20260508224815.076	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 22:49	Success	-	View
exp_self.20260508224044.304_20260508_224045 Paper: self.20260508224044.304	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508224044.304 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 22:41	Success	-	View
exp_self.20260508223350.303_20260508_223350 Paper: self.20260508223350.303	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508223350.303 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 22:34	Success	-	View
exp_self.20260508222643.302_20260508_222653 Paper: self.20260508222643.302	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508222643.302 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 22:27	Success	-	View
exp_self.20260508221921.301_20260508_221921 Paper: self.20260508221921.301	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508221921.301 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 22:20	Success	-	View
exp_pytrain.20260508221627.075_20260508_221627 Paper: pytrain.20260508221627.075	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 22:17	Success	-	View
exp_self.20260508221137.300_20260508_221137 Paper: self.20260508221137.300	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508221137.300 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 22:12	Success	-	View
exp_self.20260508220436.299_20260508_220436 Paper: self.20260508220436.299	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508220436.299 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 22:05	Success	-	View
exp_self.20260508215723.298_20260508_215724 Paper: self.20260508215723.298	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508215723.298 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 21:58	Success	-	View
exp_self.20260508215007.297_20260508_215008 Paper: self.20260508215007.297	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508215007.297 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 21:51	Success	-	View
exp_pytrain.20260508214446.074_20260508_214447 Paper: pytrain.20260508214446.074	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 21:45	Success	-	View
exp_self.20260508214229.296_20260508_214230 Paper: self.20260508214229.296	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508214229.296 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 21:43	Success	-	View
exp_self.20260508213518.295_20260508_213518 Paper: self.20260508213518.295	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508213518.295 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 21:36	Success	-	View
exp_self.20260508212757.294_20260508_212758 Paper: self.20260508212757.294	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508212757.294 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 21:29	Success	-	View
exp_self.20260508212040.293_20260508_212040 Paper: self.20260508212040.293	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508212040.293 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 21:21	Success	-	View
exp_self.20260508211347.292_20260508_211347 Paper: self.20260508211347.292	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508211347.292 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 21:14	Success	-	View
exp_pytrain.20260508211100.073_20260508_211100 Paper: pytrain.20260508211100.073	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 21:12	Success	-	View
exp_self.20260508210410.291_20260508_210411 Paper: self.20260508210410.291	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508210410.291 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 21:05	Success	-	View
exp_self.20260508205706.290_20260508_205706 Paper: self.20260508205706.290	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508205706.290 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 20:58	Success	-	View
exp_self.20260508204954.289_20260508_204955 Paper: self.20260508204954.289	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508204954.289 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 20:50	Success	-	View
exp_self.20260508204210.288_20260508_204211 Paper: self.20260508204210.288	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508204210.288 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 20:43	Success	-	View
exp_pytrain.20260508203924.072_20260508_203924 Paper: pytrain.20260508203924.072	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 20:40	Success	-	View
exp_self.20260508203213.287_20260508_203214 Paper: self.20260508203213.287	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508203213.287 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 20:33	Success	-	View
exp_self.20260508202457.286_20260508_202458 Paper: self.20260508202457.286	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508202457.286 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 20:26	Success	-	View
exp_self.20260508201738.285_20260508_201739 Paper: self.20260508201738.285	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508201738.285 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 20:18	Success	-	View
exp_self.20260508201048.284_20260508_201048 Paper: self.20260508201048.284	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508201048.284 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 20:11	Success	-	View
exp_pytrain.20260508200747.071_20260508_200758 Paper: pytrain.20260508200747.071	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 20:09	Success	-	View
exp_self.20260508200311.283_20260508_200312 Paper: self.20260508200311.283	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508200311.283 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 20:04	Success	-	View
exp_self.20260508195555.282_20260508_195555 Paper: self.20260508195555.282	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508195555.282 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 19:56	Success	-	View
exp_self.20260508194829.281_20260508_194829 Paper: self.20260508194829.281	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508194829.281 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 19:49	Success	-	View
exp_self.20260508194136.280_20260508_194136 Paper: self.20260508194136.280	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508194136.280 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 19:42	Success	-	View
exp_pytrain.20260508193626.070_20260508_193626 Paper: pytrain.20260508193626.070	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 19:37	Success	-	View
exp_self.20260508193410.279_20260508_193410 Paper: self.20260508193410.279	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508193410.279 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 19:35	Success	-	View
exp_self.20260508192712.278_20260508_192712 Paper: self.20260508192712.278	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508192712.278 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 19:28	Success	-	View
exp_cr_10.1371_journal.pone.0346078_20260508_192235 Paper: cr_10.1371_journal.pone.0346078	Systematic evaluation of the DeepSeek large language model for clinical diagnostic reasoning Paper ID: cr_10.1371_journal.pone.0346078 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Re...	05-08 19:23	Success	-	View
exp_self.20260508192014.277_20260508_192014 Paper: self.20260508192014.277	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508192014.277 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 19:21	Success	-	View
exp_gh_IbadKhalid7_turboquant-model_20260508_191646 Paper: gh_IbadKhalid7_turboquant-model	IbadKhalid7/turboquant-model Paper ID: gh_IbadKhalid7_turboquant-model - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Re...	05-08 19:17	Success	-	View
exp_self.20260508191258.276_20260508_191258 Paper: self.20260508191258.276	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508191258.276 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 19:14	Success	-	View
exp_self.20260508190543.275_20260508_190543 Paper: self.20260508190543.275	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508190543.275 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 19:06	Success	-	View
exp_pytrain.20260508190254.069_20260508_190255 Paper: pytrain.20260508190254.069	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 19:03	Success	-	View
exp_self.20260508185635.274_20260508_185635 Paper: self.20260508185635.274	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508185635.274 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 18:57	Success	-	View
exp_hf_2605.06663_20260508_185138 Paper: hf_2605.06663	EMO: Pretraining Mixture of Experts for Emergent Modularity Paper ID: hf_2605.06663 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-08 18:52	Success	-	View
exp_self.20260508184917.273_20260508_184917 Paper: self.20260508184917.273	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508184917.273 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 18:50	Success	-	View
exp_self.20260508184104.272_20260508_184104 Paper: self.20260508184104.272	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508184104.272 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 18:42	Success	-	View
exp_self.20260508183344.271_20260508_183344 Paper: self.20260508183344.271	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508183344.271 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 18:34	Success	-	View
exp_pytrain.20260508183042.068_20260508_183042 Paper: pytrain.20260508183042.068	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 18:31	Success	-	View
exp_self.20260508182417.270_20260508_182418 Paper: self.20260508182417.270	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508182417.270 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 18:25	Success	-	View
exp_self.20260508181703.269_20260508_181703 Paper: self.20260508181703.269	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508181703.269 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 18:18	Success	-	View
exp_self.20260508180955.268_20260508_180955 Paper: self.20260508180955.268	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508180955.268 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 18:10	Success	-	View
exp_self.20260508180141.267_20260508_180141 Paper: self.20260508180141.267	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508180141.267 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 18:02	Success	-	View
exp_pytrain.20260508175840.067_20260508_175841 Paper: pytrain.20260508175840.067	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 17:59	Success	-	View
exp_self.20260508175409.266_20260508_175410 Paper: self.20260508175409.266	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508175409.266 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 17:55	Success	-	View
exp_self.20260508174643.265_20260508_174652 Paper: self.20260508174643.265	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508174643.265 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 17:47	Success	-	View
exp_self.20260508173907.264_20260508_173907 Paper: self.20260508173907.264	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508173907.264 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 17:40	Success	-	View
exp_self.20260508173149.263_20260508_173149 Paper: self.20260508173149.263	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508173149.263 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 17:32	Success	-	View
exp_pytrain.20260508172636.066_20260508_172637 Paper: pytrain.20260508172636.066	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 17:27	Success	-	View
exp_self.20260508172417.262_20260508_172418 Paper: self.20260508172417.262	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508172417.262 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 17:25	Success	-	View
exp_self.20260508171707.261_20260508_171707 Paper: self.20260508171707.261	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508171707.261 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 17:18	Success	-	View
exp_cr_10.3390_educsci16050747_20260508_171334 Paper: cr_10.3390_educsci16050747	The CO-SPACE Model: Developing an Analytical Framework for Interdisciplinary Student Collaboration Paper ID: cr_10.3390_educsci16050747 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recover...	05-08 17:14	Success	-	View
exp_self.20260508171007.260_20260508_171007 Paper: self.20260508171007.260	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508171007.260 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 17:11	Success	-	View
exp_self.20260508170319.259_20260508_170319 Paper: self.20260508170319.259	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508170319.259 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 17:04	Success	-	View
exp_self.20260508165625.258_20260508_165625 Paper: self.20260508165625.258	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508165625.258 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 16:57	Success	-	View
exp_pytrain.20260508165340.065_20260508_165341 Paper: pytrain.20260508165340.065	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 16:54	Success	-	View
exp_self.20260508164709.257_20260508_164710 Paper: self.20260508164709.257	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508164709.257 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 16:48	Success	-	View
exp_self.20260508163936.256_20260508_163937 Paper: self.20260508163936.256	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508163936.256 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 16:40	Success	-	View
exp_self.20260508163221.255_20260508_163222 Paper: self.20260508163221.255	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508163221.255 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 16:33	Success	-	View
exp_self.20260508162404.254_20260508_162404 Paper: self.20260508162404.254	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508162404.254 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 16:25	Success	-	View
exp_pytrain.20260508162119.064_20260508_162120 Paper: pytrain.20260508162119.064	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 16:22	Success	-	View
exp_cr_10.3390_systems14050529_20260508_161714 Paper: cr_10.3390_systems14050529	An Interpretable Socio-Technical Decision Support System for Bi-Objective Urban Distribution Center Location: Adaptive O... Paper ID: cr_10.3390_systems14050529 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recover...	05-08 16:18	Success	-	View
exp_self.20260508161446.253_20260508_161446 Paper: self.20260508161446.253	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508161446.253 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 16:15	Success	-	View
exp_self.20260508160647.252_20260508_160647 Paper: self.20260508160647.252	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508160647.252 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 16:07	Success	-	View
exp_self.20260508155942.251_20260508_155951 Paper: self.20260508155942.251	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508155942.251 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 16:00	Success	-	View
exp_self.20260508155138.250_20260508_155139 Paper: self.20260508155138.250	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508155138.250 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 15:52	Success	-	View
exp_pytrain.20260508154842.063_20260508_154842 Paper: pytrain.20260508154842.063	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 15:49	Success	-	View
exp_self.20260508154416.249_20260508_154416 Paper: self.20260508154416.249	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508154416.249 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 15:45	Success	-	View
exp_self.20260508153607.248_20260508_153608 Paper: self.20260508153607.248	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508153607.248 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 15:37	Success	-	View
exp_self.20260508152816.247_20260508_152816 Paper: self.20260508152816.247	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508152816.247 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 15:29	Success	-	View
exp_self.20260508151958.246_20260508_151959 Paper: self.20260508151958.246	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508151958.246 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 15:21	Success	-	View
exp_pytrain.20260508151620.062_20260508_151620 Paper: pytrain.20260508151620.062	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 15:17	Success	-	View
exp_self.20260508151159.245_20260508_151200 Paper: self.20260508151159.245	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508151159.245 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 15:13	Success	-	View
exp_self.20260508150344.244_20260508_150344 Paper: self.20260508150344.244	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508150344.244 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 15:04	Success	-	View
exp_self.20260508145543.243_20260508_145544 Paper: self.20260508145543.243	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508145543.243 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 14:56	Success	-	View
exp_self.20260508144730.242_20260508_144730 Paper: self.20260508144730.242	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508144730.242 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 14:48	Success	-	View
exp_pytrain.20260508144401.061_20260508_144401 Paper: pytrain.20260508144401.061	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 14:45	Success	-	View
exp_self.20260508143814.241_20260508_143814 Paper: self.20260508143814.241	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508143814.241 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 14:39	Success	-	View
exp_self.20260508143020.240_20260508_143021 Paper: self.20260508143020.240	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508143020.240 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 14:31	Success	-	View
exp_self.20260508142224.239_20260508_142224 Paper: self.20260508142224.239	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508142224.239 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 14:23	Success	-	View
exp_self.20260508141453.238_20260508_141453 Paper: self.20260508141453.238	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508141453.238 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 14:15	Success	-	View
exp_pytrain.20260508141155.060_20260508_141156 Paper: pytrain.20260508141155.060	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 14:12	Success	-	View
exp_self.20260508140704.237_20260508_140705 Paper: self.20260508140704.237	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508140704.237 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 14:08	Success	-	View
exp_self.20260508135851.236_20260508_135851 Paper: self.20260508135851.236	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508135851.236 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 13:59	Success	-	View
exp_self.20260508135033.235_20260508_135033 Paper: self.20260508135033.235	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508135033.235 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 13:51	Success	-	View
exp_self.20260508134259.234_20260508_134259 Paper: self.20260508134259.234	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508134259.234 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 13:44	Success	-	View
exp_pytrain.20260508133911.059_20260508_133912 Paper: pytrain.20260508133911.059	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 13:40	Success	-	View
exp_self.20260508133200.233_20260508_133201 Paper: self.20260508133200.233	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508133200.233 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 13:33	Success	-	View
exp_self.20260508132453.232_20260508_132454 Paper: self.20260508132453.232	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508132453.232 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 13:25	Success	-	View
exp_self.20260508131645.231_20260508_131646 Paper: self.20260508131645.231	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508131645.231 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 13:17	Success	-	View
exp_self.20260508130934.230_20260508_130934 Paper: self.20260508130934.230	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508130934.230 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 13:10	Success	-	View
exp_pytrain.20260508130559.058_20260508_130559 Paper: pytrain.20260508130559.058	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 13:07	Success	-	View
exp_self.20260508125845.229_20260508_125845 Paper: self.20260508125845.229	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508125845.229 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 12:59	Success	-	View
exp_self.20260508125034.228_20260508_125034 Paper: self.20260508125034.228	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508125034.228 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 12:51	Success	-	View
exp_self.20260508124342.227_20260508_124342 Paper: self.20260508124342.227	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508124342.227 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 12:44	Success	-	View
exp_self.20260508123623.226_20260508_123623 Paper: self.20260508123623.226	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508123623.226 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 12:37	Success	-	View
exp_pytrain.20260508123326.057_20260508_123327 Paper: pytrain.20260508123326.057	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 12:34	Success	-	View
exp_self.20260508122700.225_20260508_122701 Paper: self.20260508122700.225	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508122700.225 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 12:28	Success	-	View
exp_self.20260508122001.224_20260508_122001 Paper: self.20260508122001.224	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508122001.224 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 12:21	Success	-	View
exp_self.20260508121303.223_20260508_121303 Paper: self.20260508121303.223	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508121303.223 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 12:14	Success	-	View
exp_self.20260508120509.222_20260508_120509 Paper: self.20260508120509.222	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508120509.222 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 12:06	Success	-	View
exp_pytrain.20260508120206.056_20260508_120206 Paper: pytrain.20260508120206.056	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 12:03	Success	-	View
exp_self.20260508115545.221_20260508_115546 Paper: self.20260508115545.221	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508115545.221 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 11:56	Success	-	View
exp_self.20260508114830.220_20260508_114831 Paper: self.20260508114830.220	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508114830.220 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 11:49	Success	-	View
exp_self.20260508114107.219_20260508_114107 Paper: self.20260508114107.219	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508114107.219 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 11:42	Success	-	View
exp_self.20260508113419.218_20260508_113420 Paper: self.20260508113419.218	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508113419.218 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 11:35	Success	-	View
exp_pytrain.20260508112933.055_20260508_112933 Paper: pytrain.20260508112933.055	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 11:30	Success	-	View
exp_self.20260508112632.217_20260508_112632 Paper: self.20260508112632.217	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508112632.217 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 11:27	Success	-	View
exp_self.20260508111824.216_20260508_111824 Paper: self.20260508111824.216	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508111824.216 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 11:19	Success	-	View
exp_self.20260508110954.215_20260508_110954 Paper: self.20260508110954.215	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508110954.215 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 11:10	Success	-	View
exp_self.20260508110158.214_20260508_110158 Paper: self.20260508110158.214	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508110158.214 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 11:03	Success	-	View
exp_pytrain.20260508105634.054_20260508_105635 Paper: pytrain.20260508105634.054	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 10:57	Success	-	View
exp_self.20260508105329.213_20260508_105330 Paper: self.20260508105329.213	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508105329.213 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 10:54	Success	-	View
exp_self.20260508104501.212_20260508_104502 Paper: self.20260508104501.212	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508104501.212 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 10:46	Success	-	View
exp_self.20260508103635.211_20260508_103636 Paper: self.20260508103635.211	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508103635.211 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 10:37	Success	-	View
exp_self.20260508102828.210_20260508_102828 Paper: self.20260508102828.210	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508102828.210 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 10:29	Success	-	View
exp_pytrain.20260508102450.053_20260508_102450 Paper: pytrain.20260508102450.053	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 10:25	Success	-	View
exp_self.20260508102043.209_20260508_102043 Paper: self.20260508102043.209	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508102043.209 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 10:21	Success	-	View
exp_self.20260508101214.208_20260508_101214 Paper: self.20260508101214.208	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508101214.208 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 10:13	Success	-	View
exp_self.20260508100409.207_20260508_100409 Paper: self.20260508100409.207	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508100409.207 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 10:05	Success	-	View
exp_self.20260508095552.206_20260508_095552 Paper: self.20260508095552.206	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508095552.206 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 09:56	Success	-	View
exp_pytrain.20260508095210.052_20260508_095210 Paper: pytrain.20260508095210.052	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 09:53	Success	-	View
exp_self.20260508094626.205_20260508_094626 Paper: self.20260508094626.205	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508094626.205 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 09:47	Success	-	View
exp_self.20260508093836.204_20260508_093836 Paper: self.20260508093836.204	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508093836.204 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 09:39	Success	-	View
exp_self.20260508093037.203_20260508_093037 Paper: self.20260508093037.203	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508093037.203 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 09:31	Success	-	View
exp_self.20260508092248.202_20260508_092249 Paper: self.20260508092248.202	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508092248.202 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 09:23	Success	-	View
exp_pytrain.20260508091957.051_20260508_091957 Paper: pytrain.20260508091957.051	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 09:20	Success	-	View
exp_self.20260508091420.201_20260508_091420 Paper: self.20260508091420.201	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508091420.201 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 09:15	Success	-	View
exp_self.20260508090539.200_20260508_090540 Paper: self.20260508090539.200	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508090539.200 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 09:06	Success	-	View
exp_self.20260508085824.199_20260508_085825 Paper: self.20260508085824.199	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508085824.199 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 08:59	Success	-	View
exp_self.20260508085104.198_20260508_085105 Paper: self.20260508085104.198	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508085104.198 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 08:52	Success	-	View
exp_pytrain.20260508084806.050_20260508_084807 Paper: pytrain.20260508084806.050	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 08:49	Success	-	View
exp_self.20260508084329.197_20260508_084329 Paper: self.20260508084329.197	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508084329.197 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 08:44	Success	-	View
exp_self.20260508083605.196_20260508_083605 Paper: self.20260508083605.196	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508083605.196 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 08:37	Success	-	View
exp_self.20260508082858.195_20260508_082858 Paper: self.20260508082858.195	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508082858.195 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 08:30	Success	-	View
exp_self.20260508082121.194_20260508_082121 Paper: self.20260508082121.194	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508082121.194 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 08:22	Success	-	View
exp_pytrain.20260508081601.049_20260508_081601 Paper: pytrain.20260508081601.049	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 08:17	Success	-	View
exp_self.20260508081337.193_20260508_081338 Paper: self.20260508081337.193	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508081337.193 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 08:14	Success	-	View
exp_self.20260508080628.192_20260508_080629 Paper: self.20260508080628.192	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508080628.192 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 08:07	Success	-	View
exp_self.20260508075924.191_20260508_075924 Paper: self.20260508075924.191	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508075924.191 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 08:00	Success	-	View
exp_self.20260508075240.190_20260508_075241 Paper: self.20260508075240.190	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508075240.190 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 07:53	Success	-	View
exp_self.20260508074543.189_20260508_074543 Paper: self.20260508074543.189	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508074543.189 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 07:46	Success	-	View
exp_pytrain.20260508074259.048_20260508_074300 Paper: pytrain.20260508074259.048	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 07:44	Success	-	View
exp_self.20260508073629.188_20260508_073630 Paper: self.20260508073629.188	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508073629.188 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 07:37	Success	-	View
exp_self.20260508072929.187_20260508_072930 Paper: self.20260508072929.187	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508072929.187 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 07:30	Success	-	View
exp_self.20260508072236.186_20260508_072236 Paper: self.20260508072236.186	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508072236.186 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 07:23	Success	-	View
exp_self.20260508071538.185_20260508_071538 Paper: self.20260508071538.185	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508071538.185 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 07:16	Success	-	View
exp_pytrain.20260508071136.047_20260508_071136 Paper: pytrain.20260508071136.047	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 07:12	Success	-	View
exp_self.20260508070800.184_20260508_070800 Paper: self.20260508070800.184	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508070800.184 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 07:09	Success	-	View
exp_self.20260508070058.183_20260508_070059 Paper: self.20260508070058.183	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508070058.183 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 07:02	Success	-	View
exp_self.20260508065226.182_20260508_065226 Paper: self.20260508065226.182	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508065226.182 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 06:53	Success	-	View
exp_self.20260508064516.181_20260508_064525 Paper: self.20260508064516.181	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508064516.181 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 06:46	Success	-	View
exp_pytrain.20260508063943.046_20260508_063944 Paper: pytrain.20260508063943.046	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 06:40	Success	-	View
exp_self.20260508063726.180_20260508_063727 Paper: self.20260508063726.180	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508063726.180 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 06:38	Success	-	View
exp_self.20260508062737.179_20260508_062737 Paper: self.20260508062737.179	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508062737.179 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 06:28	Success	-	View
exp_hf_2605.04045_20260508_062411 Paper: hf_2605.04045	Audio-Visual Intelligence in Large Foundation Models Paper ID: hf_2605.04045 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-08 06:25	Success	-	View
exp_self.20260508061725.178_20260508_061725 Paper: self.20260508061725.178	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508061725.178 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 06:18	Success	-	View
exp_self.20260508061007.177_20260508_061007 Paper: self.20260508061007.177	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508061007.177 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 06:11	Success	-	View
exp_pytrain.20260508060640.045_20260508_060641 Paper: pytrain.20260508060640.045	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 06:07	Success	-	View
exp_hf_2605.05758_20260508_060320 Paper: hf_2605.05758	BioTool: A Comprehensive Tool-Calling Dataset for Enhancing Biomedical Capabilities of Large Language Models Paper ID: hf_2605.05758 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-08 06:04	Success	-	View
exp_self.20260508055913.176_20260508_055913 Paper: self.20260508055913.176	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508055913.176 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 06:00	Success	-	View
exp_self.20260508055200.175_20260508_055201 Paper: self.20260508055200.175	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508055200.175 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 05:53	Success	-	View
exp_self.20260508054445.174_20260508_054445 Paper: self.20260508054445.174	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508054445.174 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 05:45	Success	-	View
exp_self.20260508053712.173_20260508_053712 Paper: self.20260508053712.173	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508053712.173 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 05:38	Success	-	View
exp_pytrain.20260508053352.044_20260508_053353 Paper: pytrain.20260508053352.044	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 05:34	Success	-	View
exp_self.20260508052705.172_20260508_052706 Paper: self.20260508052705.172	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508052705.172 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 05:28	Success	-	View
exp_self.20260508051956.171_20260508_051957 Paper: self.20260508051956.171	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508051956.171 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 05:21	Success	-	View
exp_self.20260508051237.170_20260508_051238 Paper: self.20260508051237.170	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508051237.170 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 05:13	Success	-	View
exp_self.20260508050519.169_20260508_050519 Paper: self.20260508050519.169	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508050519.169 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 05:06	Success	-	View
exp_pytrain.20260508050157.043_20260508_050158 Paper: pytrain.20260508050157.043	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 05:03	Success	-	View
exp_self.20260508045751.168_20260508_045751 Paper: self.20260508045751.168	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508045751.168 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 04:58	Success	-	View
exp_self.20260508045034.167_20260508_045035 Paper: self.20260508045034.167	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508045034.167 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 04:51	Success	-	View
exp_self.20260508044321.166_20260508_044322 Paper: self.20260508044321.166	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508044321.166 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 04:44	Success	-	View
exp_self.20260508043601.165_20260508_043602 Paper: self.20260508043601.165	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508043601.165 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 04:37	Success	-	View
exp_hf_2605.04956_20260508_043221 Paper: hf_2605.04956	KernelBench-X: A Comprehensive Benchmark for Evaluating LLM-Generated GPU Kernels Paper ID: hf_2605.04956 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-08 04:33	Success	-	View
exp_pytrain.20260508042925.042_20260508_042926 Paper: pytrain.20260508042925.042	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 04:30	Success	-	View
exp_self.20260508042242.164_20260508_042242 Paper: self.20260508042242.164	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508042242.164 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 04:23	Success	-	View
exp_self.20260508041522.163_20260508_041522 Paper: self.20260508041522.163	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508041522.163 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 04:16	Success	-	View
exp_self.20260508040802.162_20260508_040803 Paper: self.20260508040802.162	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508040802.162 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 04:09	Success	-	View
exp_self.20260508040043.161_20260508_040044 Paper: self.20260508040043.161	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508040043.161 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 04:01	Success	-	View
exp_pytrain.20260508035723.041_20260508_035723 Paper: pytrain.20260508035723.041	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 03:58	Success	-	View
exp_self.20260508035039.160_20260508_035039 Paper: self.20260508035039.160	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508035039.160 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 03:51	Success	-	View
exp_self.20260508034326.159_20260508_034326 Paper: self.20260508034326.159	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508034326.159 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 03:44	Success	-	View
exp_self.20260508033609.158_20260508_033610 Paper: self.20260508033609.158	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508033609.158 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 03:37	Success	-	View
exp_self.20260508032851.157_20260508_032852 Paper: self.20260508032851.157	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508032851.157 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 03:29	Success	-	View
exp_pytrain.20260508032531.040_20260508_032532 Paper: pytrain.20260508032531.040	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 03:26	Success	-	View
exp_self.20260508031842.156_20260508_031843 Paper: self.20260508031842.156	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508031842.156 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 03:19	Success	-	View
exp_self.20260508031131.155_20260508_031131 Paper: self.20260508031131.155	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508031131.155 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 03:12	Success	-	View
exp_self.20260508030417.154_20260508_030417 Paper: self.20260508030417.154	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508030417.154 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 03:05	Success	-	View
exp_self.20260508025656.153_20260508_025657 Paper: self.20260508025656.153	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508025656.153 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 02:58	Success	-	View
exp_pytrain.20260508025335.039_20260508_025336 Paper: pytrain.20260508025335.039	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 02:54	Success	-	View
exp_self.20260508024825.152_20260508_024826 Paper: self.20260508024825.152	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508024825.152 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 02:49	Success	-	View
exp_self.20260508024059.151_20260508_024059 Paper: self.20260508024059.151	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508024059.151 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 02:42	Success	-	View
exp_self.20260508023344.150_20260508_023345 Paper: self.20260508023344.150	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508023344.150 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 02:34	Success	-	View
exp_hf_2605.06216_20260508_022843 Paper: hf_2605.06216	TIDE: Every Layer Knows the Token Beneath the Context Paper ID: hf_2605.06216 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-08 02:29	Success	-	View
exp_self.20260508022546.149_20260508_022547 Paper: self.20260508022546.149	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508022546.149 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 02:26	Success	-	View
exp_pytrain.20260508022107.038_20260508_022108 Paper: pytrain.20260508022107.038	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 02:22	Success	-	View
exp_self.20260508021707.148_20260508_021707 Paper: self.20260508021707.148	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508021707.148 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 02:18	Success	-	View
exp_self.20260508020750.147_20260508_020751 Paper: self.20260508020750.147	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508020750.147 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 02:08	Success	-	View
exp_self.20260508020038.146_20260508_020038 Paper: self.20260508020038.146	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508020038.146 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 02:01	Success	-	View
exp_self.20260508015330.145_20260508_015331 Paper: self.20260508015330.145	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508015330.145 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 01:54	Success	-	View
exp_pytrain.20260508014859.037_20260508_014859 Paper: pytrain.20260508014859.037	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 01:50	Success	-	View
exp_self.20260508014606.144_20260508_014606 Paper: self.20260508014606.144	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508014606.144 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 01:47	Success	-	View
exp_self.20260508013847.143_20260508_013848 Paper: self.20260508013847.143	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508013847.143 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 01:39	Success	-	View
exp_self.20260508013017.142_20260508_013017 Paper: self.20260508013017.142	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508013017.142 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 01:31	Success	-	View
exp_self.20260508012305.141_20260508_012306 Paper: self.20260508012305.141	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508012305.141 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 01:24	Success	-	View
exp_cr_10.3389_frai.2026.1760246_20260508_011940 Paper: cr_10.3389_frai.2026.1760246	Language-based personality assessment from life narratives: a focus on model interpretability and efficiency Paper ID: cr_10.3389_frai.2026.1760246 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recov...	05-08 01:20	Success	-	View
exp_pytrain.20260508011643.036_20260508_011643 Paper: pytrain.20260508011643.036	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 01:17	Success	-	View
exp_self.20260508011134.140_20260508_011134 Paper: self.20260508011134.140	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508011134.140 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 01:12	Success	-	View
exp_self.20260508010338.139_20260508_010339 Paper: self.20260508010338.139	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508010338.139 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 01:04	Success	-	View
exp_self.20260508005624.138_20260508_005625 Paper: self.20260508005624.138	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508005624.138 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 00:57	Success	-	View
exp_self.20260508004911.137_20260508_004912 Paper: self.20260508004911.137	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508004911.137 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 00:50	Success	-	View
exp_pytrain.20260508004440.035_20260508_004440 Paper: pytrain.20260508004440.035	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 00:45	Success	-	View
exp_self.20260508004148.136_20260508_004148 Paper: self.20260508004148.136	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508004148.136 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 00:42	Success	-	View
exp_self.20260508003319.135_20260508_003319 Paper: self.20260508003319.135	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508003319.135 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 00:34	Success	-	View
exp_self.20260508002605.134_20260508_002605 Paper: self.20260508002605.134	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508002605.134 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 00:27	Success	-	View
exp_cr_10.3389_fendo.2026.1776707_20260508_002240 Paper: cr_10.3389_fendo.2026.1776707	Global knowledge graph of osteoporosis biomarkers based on large language model embeddings and complex network algorithm... Paper ID: cr_10.3389_fendo.2026.1776707 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Reco...	05-08 00:23	Success	-	View
exp_cr_10.3389_fmed.2026.1817215_20260508_001814 Paper: cr_10.3389_fmed.2026.1817215	Low-energy small language models with retrieval-augmented generation can surpass large-model performance in rheumatology Paper ID: cr_10.3389_fmed.2026.1817215 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recov...	05-08 00:19	Success	-	View
exp_self.20260508001513.133_20260508_001513 Paper: self.20260508001513.133	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508001513.133 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 00:16	Success	-	View
exp_pytrain.20260508001153.034_20260508_001154 Paper: pytrain.20260508001153.034	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-08 00:12	Success	-	View
exp_self.20260508000505.132_20260508_000506 Paper: self.20260508000505.132	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260508000505.132 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-08 00:06	Success	-	View
exp_self.20260507235702.131_20260507_235702 Paper: self.20260507235702.131	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507235702.131 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 23:58	Success	-	View
exp_self.20260507234936.130_20260507_234936 Paper: self.20260507234936.130	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507234936.130 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 23:50	Success	-	View
exp_hf_2605.04451_20260507_234500 Paper: hf_2605.04451	RemoteZero: Geospatial Reasoning with Zero Human Annotations Paper ID: hf_2605.04451 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-07 23:46	Success	-	View
exp_self.20260507234249.129_20260507_234250 Paper: self.20260507234249.129	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507234249.129 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 23:43	Success	-	View
exp_pytrain.20260507234010.033_20260507_234010 Paper: pytrain.20260507234010.033	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 23:41	Success	-	View
exp_self.20260507233311.128_20260507_233311 Paper: self.20260507233311.128	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507233311.128 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 23:34	Success	-	View
exp_self.20260507232531.127_20260507_232531 Paper: self.20260507232531.127	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507232531.127 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 23:26	Success	-	View
exp_self.20260507231755.126_20260507_231755 Paper: self.20260507231755.126	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507231755.126 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 23:18	Success	-	View
exp_self.20260507231019.125_20260507_231020 Paper: self.20260507231019.125	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507231019.125 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 23:11	Success	-	View
exp_pytrain.20260507230746.032_20260507_230746 Paper: pytrain.20260507230746.032	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 23:08	Success	-	View
exp_self.20260507230214.124_20260507_230214 Paper: self.20260507230214.124	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507230214.124 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 23:03	Success	-	View
exp_self.20260507225439.123_20260507_225439 Paper: self.20260507225439.123	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507225439.123 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 22:55	Success	-	View
exp_self.20260507224704.122_20260507_224704 Paper: self.20260507224704.122	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507224704.122 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 22:48	Success	-	View
exp_hf_2605.06222_20260507_224341 Paper: hf_2605.06222	When to Trust Imagination: Adaptive Action Execution for World Action Models Paper ID: hf_2605.06222 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-07 22:44	Success	-	View
exp_self.20260507223814.121_20260507_223814 Paper: self.20260507223814.121	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507223814.121 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 22:39	Success	-	View
exp_pytrain.20260507223534.031_20260507_223534 Paper: pytrain.20260507223534.031	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 22:36	Success	-	View
exp_self.20260507223009.120_20260507_223010 Paper: self.20260507223009.120	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507223009.120 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 22:31	Success	-	View
exp_hf_2605.04647_20260507_222646 Paper: hf_2605.04647	ReflectDrive-2: Reinforcement-Learning-Aligned Self-Editing for Discrete Diffusion Driving Paper ID: hf_2605.04647 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-07 22:27	Success	-	View
exp_self.20260507222118.119_20260507_222118 Paper: self.20260507222118.119	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507222118.119 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 22:22	Success	-	View
exp_hf_2605.06376_20260507_221820 Paper: hf_2605.06376	Continuous-Time Distribution Matching for Few-Step Diffusion Distillation Paper ID: hf_2605.06376 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-07 22:19	Success	-	View
exp_hf_2605.06356_20260507_221416 Paper: hf_2605.06356	SwiftI2V: Efficient High-Resolution Image-to-Video Generation via Conditional Segment-wise Generation Paper ID: hf_2605.06356 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-07 22:15	Success	-	View
exp_self.20260507221205.118_20260507_221206 Paper: self.20260507221205.118	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507221205.118 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 22:13	Success	-	View
exp_hf_2605.06200_20260507_220843 Paper: hf_2605.06200	A^2TGPO: Agentic Turn-Group Policy Optimization with Adaptive Turn-level Clipping Paper ID: hf_2605.06200 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-07 22:09	Success	-	View
exp_2605.06664v1_20260507_220620 Paper: 2605.06664v1	BAMI: Training-Free Bias Mitigation in GUI Grounding Paper ID: 2605.06664v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-07 22:07	Success	-	View
exp_pytrain.20260507220406.030_20260507_220406 Paper: pytrain.20260507220406.030	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 22:05	Success	-	View
exp_self.20260507220159.117_20260507_220159 Paper: self.20260507220159.117	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507220159.117 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 22:03	Success	-	View
exp_2605.06663v1_20260507_215836 Paper: 2605.06663v1	EMO: Pretraining Mixture of Experts for Emergent Modularity Paper ID: 2605.06663v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-07 21:59	Success	-	View
exp_self.20260507215301.116_20260507_215302 Paper: self.20260507215301.116	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507215301.116 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 21:54	Success	-	View
exp_2605.06665v1_20260507_214944 Paper: 2605.06665v1	UniPool: A Globally Shared Expert Pool for Mixture-of-Experts Paper ID: 2605.06665v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-07 21:50	Success	-	View
exp_self.20260507214324.115_20260507_214325 Paper: self.20260507214324.115	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507214324.115 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 21:44	Success	-	View
exp_self.20260507213551.114_20260507_213552 Paper: self.20260507213551.114	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507213551.114 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 21:36	Success	-	View
exp_hf_2605.05922_20260507_213253 Paper: hf_2605.05922	Think, then Score: Decoupled Reasoning and Scoring for Video Reward Modeling Paper ID: hf_2605.05922 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-07 21:33	Success	-	View
exp_pytrain.20260507213042.029_20260507_213042 Paper: pytrain.20260507213042.029	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 21:31	Success	-	View
exp_self.20260507212727.113_20260507_212727 Paper: self.20260507212727.113	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507212727.113 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 21:28	Success	-	View
exp_hf_2605.06665_20260507_212432 Paper: hf_2605.06665	UniPool: A Globally Shared Expert Pool for Mixture-of-Experts Paper ID: hf_2605.06665 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-07 21:25	Success	-	View
exp_hf_2605.06548_20260507_212028 Paper: hf_2605.06548	Continuous Latent Diffusion Language Model Paper ID: hf_2605.06548 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-07 21:21	Success	-	View
exp_self.20260507211819.112_20260507_211819 Paper: self.20260507211819.112	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507211819.112 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 21:19	Success	-	View
exp_self.20260507211038.111_20260507_211039 Paper: self.20260507211038.111	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507211038.111 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 21:11	Success	-	View
exp_2605.06225v1_20260507_210713 Paper: 2605.06225v1	Memory Inception: Latent-Space KV Cache Manipulation for Steering LLMs Paper ID: 2605.06225v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-07 21:08	Success	-	View
exp_self.20260507210349.110_20260507_210349 Paper: self.20260507210349.110	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507210349.110 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 21:04	Success	-	View
exp_pytrain.20260507205859.028_20260507_205859 Paper: pytrain.20260507205859.028	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 21:00	Success	-	View
exp_self.20260507205653.109_20260507_205653 Paper: self.20260507205653.109	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507205653.109 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 20:57	Success	-	View
exp_2605.06230v1_20260507_205222 Paper: 2605.06230v1	Safactory: A Scalable Agent Factory for Trustworthy Autonomous Intelligence Paper ID: 2605.06230v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-07 20:53	Success	-	View
exp_self.20260507205008.108_20260507_205008 Paper: self.20260507205008.108	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507205008.108 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 20:51	Success	-	View
exp_2605.06229v1_20260507_204646 Paper: 2605.06229v1	Look Beyond Saliency: Low-Attention Guided Dual Encoding for Video Semantic Search Paper ID: 2605.06229v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-07 20:47	Success	-	View
exp_self.20260507204322.107_20260507_204322 Paper: self.20260507204322.107	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507204322.107 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 20:44	Success	-	View
exp_self.20260507203529.106_20260507_203529 Paper: self.20260507203529.106	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507203529.106 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 20:36	Success	-	View
exp_self.20260507202758.105_20260507_202758 Paper: self.20260507202758.105	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507202758.105 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 20:29	Success	-	View
exp_pytrain.20260507202522.027_20260507_202523 Paper: pytrain.20260507202522.027	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 20:26	Success	-	View
exp_self.20260507201911.104_20260507_201912 Paper: self.20260507201911.104	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507201911.104 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 20:20	Success	-	View
exp_self.20260507201131.103_20260507_201131 Paper: self.20260507201131.103	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507201131.103 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 20:12	Success	-	View
exp_self.20260507200355.102_20260507_200356 Paper: self.20260507200355.102	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507200355.102 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 20:04	Success	-	View
exp_self.20260507195611.101_20260507_195612 Paper: self.20260507195611.101	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507195611.101 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 19:57	Success	-	View
exp_pytrain.20260507195336.026_20260507_195337 Paper: pytrain.20260507195336.026	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 19:54	Success	-	View
exp_self.20260507194808.100_20260507_194809 Paper: self.20260507194808.100	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507194808.100 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 19:49	Success	-	View
exp_self.20260507194016.099_20260507_194017 Paper: self.20260507194016.099	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507194016.099 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 19:41	Success	-	View
exp_self.20260507193226.098_20260507_193226 Paper: self.20260507193226.098	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507193226.098 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 19:33	Success	-	View
exp_self.20260507192444.097_20260507_192445 Paper: self.20260507192444.097	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507192444.097 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 19:25	Success	-	View
exp_pytrain.20260507192206.025_20260507_192207 Paper: pytrain.20260507192206.025	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 19:23	Success	-	View
exp_self.20260507191508.096_20260507_191509 Paper: self.20260507191508.096	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507191508.096 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 19:16	Success	-	View
exp_self.20260507190726.095_20260507_190727 Paper: self.20260507190726.095	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507190726.095 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 19:08	Success	-	View
exp_self.20260507190034.094_20260507_190034 Paper: self.20260507190034.094	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507190034.094 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 19:01	Success	-	View
exp_self.20260507185226.093_20260507_185226 Paper: self.20260507185226.093	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507185226.093 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 18:53	Success	-	View
exp_pytrain.20260507184947.024_20260507_184948 Paper: pytrain.20260507184947.024	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 18:50	Success	-	View
exp_self.20260507184252.092_20260507_184253 Paper: self.20260507184252.092	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507184252.092 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 18:43	Success	-	View
exp_self.20260507183513.091_20260507_183513 Paper: self.20260507183513.091	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507183513.091 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 18:36	Success	-	View
exp_self.20260507182731.090_20260507_182732 Paper: self.20260507182731.090	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507182731.090 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 18:28	Success	-	View
exp_self.20260507181957.089_20260507_181957 Paper: self.20260507181957.089	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507181957.089 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 18:21	Success	-	View
exp_pytrain.20260507181722.023_20260507_181723 Paper: pytrain.20260507181722.023	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 18:18	Success	-	View
exp_self.20260507181151.088_20260507_181151 Paper: self.20260507181151.088	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507181151.088 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 18:12	Success	-	View
exp_self.20260507180417.087_20260507_180417 Paper: self.20260507180417.087	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507180417.087 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 18:05	Success	-	View
exp_self.20260507175643.086_20260507_175643 Paper: self.20260507175643.086	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507175643.086 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 17:57	Success	-	View
exp_self.20260507174801.085_20260507_174801 Paper: self.20260507174801.085	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507174801.085 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 17:49	Success	-	View
exp_pytrain.20260507174523.022_20260507_174523 Paper: pytrain.20260507174523.022	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 17:46	Success	-	View
exp_self.20260507173816.084_20260507_173817 Paper: self.20260507173816.084	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507173816.084 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 17:39	Success	-	View
exp_self.20260507173042.083_20260507_173043 Paper: self.20260507173042.083	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507173042.083 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 17:31	Success	-	View
exp_self.20260507172300.082_20260507_172301 Paper: self.20260507172300.082	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507172300.082 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 17:24	Success	-	View
exp_self.20260507171547.081_20260507_171548 Paper: self.20260507171547.081	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507171547.081 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 17:16	Success	-	View
exp_pytrain.20260507171227.021_20260507_171227 Paper: pytrain.20260507171227.021	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 17:13	Success	-	View
exp_self.20260507170539.080_20260507_170540 Paper: self.20260507170539.080	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507170539.080 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 17:06	Success	-	View
exp_self.20260507165753.079_20260507_165754 Paper: self.20260507165753.079	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507165753.079 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 16:58	Success	-	View
exp_self.20260507165016.078_20260507_165016 Paper: self.20260507165016.078	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507165016.078 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 16:51	Success	-	View
exp_self.20260507164259.077_20260507_164300 Paper: self.20260507164259.077	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507164259.077 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 16:44	Success	-	View
exp_pytrain.20260507163939.020_20260507_163939 Paper: pytrain.20260507163939.020	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 16:40	Success	-	View
exp_self.20260507163254.076_20260507_163254 Paper: self.20260507163254.076	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507163254.076 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 16:33	Success	-	View
exp_self.20260507162541.075_20260507_162541 Paper: self.20260507162541.075	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507162541.075 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 16:26	Success	-	View
exp_self.20260507161834.074_20260507_161834 Paper: self.20260507161834.074	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507161834.074 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 16:19	Success	-	View
exp_self.20260507161120.073_20260507_161120 Paper: self.20260507161120.073	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507161120.073 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 16:12	Success	-	View
exp_pytrain.20260507160754.019_20260507_160755 Paper: pytrain.20260507160754.019	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 16:08	Success	-	View
exp_self.20260507160110.072_20260507_160111 Paper: self.20260507160110.072	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507160110.072 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 16:02	Success	-	View
exp_self.20260507155356.071_20260507_155356 Paper: self.20260507155356.071	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507155356.071 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 15:55	Success	-	View
exp_self.20260507154645.070_20260507_154645 Paper: self.20260507154645.070	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507154645.070 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 15:47	Success	-	View
exp_self.20260507153911.069_20260507_153912 Paper: self.20260507153911.069	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507153911.069 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 15:40	Success	-	View
exp_pytrain.20260507153551.018_20260507_153552 Paper: pytrain.20260507153551.018	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 15:36	Success	-	View
exp_self.20260507152907.068_20260507_152907 Paper: self.20260507152907.068	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507152907.068 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 15:30	Success	-	View
exp_self.20260507152159.067_20260507_152159 Paper: self.20260507152159.067	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507152159.067 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 15:23	Success	-	View
exp_self.20260507151442.066_20260507_151442 Paper: self.20260507151442.066	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507151442.066 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 15:15	Success	-	View
exp_self.20260507150717.065_20260507_150717 Paper: self.20260507150717.065	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507150717.065 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 15:08	Success	-	View
exp_pytrain.20260507150358.017_20260507_150358 Paper: pytrain.20260507150358.017	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 15:05	Success	-	View
exp_self.20260507145710.064_20260507_145711 Paper: self.20260507145710.064	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507145710.064 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 14:58	Success	-	View
exp_self.20260507145000.063_20260507_145000 Paper: self.20260507145000.063	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507145000.063 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 14:51	Success	-	View
exp_self.20260507144241.062_20260507_144242 Paper: self.20260507144241.062	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507144241.062 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 14:43	Success	-	View
exp_self.20260507143529.061_20260507_143529 Paper: self.20260507143529.061	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507143529.061 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 14:36	Success	-	View
exp_pytrain.20260507143138.016_20260507_143139 Paper: pytrain.20260507143138.016	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 14:32	Success	-	View
exp_self.20260507142452.060_20260507_142453 Paper: self.20260507142452.060	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507142452.060 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 14:25	Success	-	View
exp_self.20260507141737.059_20260507_141737 Paper: self.20260507141737.059	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507141737.059 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 14:18	Success	-	View
exp_self.20260507141031.058_20260507_141031 Paper: self.20260507141031.058	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507141031.058 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 14:11	Success	-	View
exp_self.20260507140313.057_20260507_140313 Paper: self.20260507140313.057	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507140313.057 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 14:04	Success	-	View
exp_pytrain.20260507135945.015_20260507_135946 Paper: pytrain.20260507135945.015	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 14:00	Success	-	View
exp_self.20260507135303.056_20260507_135303 Paper: self.20260507135303.056	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507135303.056 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 13:54	Success	-	View
exp_self.20260507134548.055_20260507_134548 Paper: self.20260507134548.055	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507134548.055 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 13:46	Success	-	View
exp_self.20260507133839.054_20260507_133839 Paper: self.20260507133839.054	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507133839.054 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 13:39	Success	-	View
exp_self.20260507133125.053_20260507_133125 Paper: self.20260507133125.053	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507133125.053 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 13:32	Success	-	View
exp_pytrain.20260507132755.014_20260507_132756 Paper: pytrain.20260507132755.014	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 13:28	Success	-	View
exp_self.20260507132116.052_20260507_132117 Paper: self.20260507132116.052	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507132116.052 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 13:22	Success	-	View
exp_self.20260507131402.051_20260507_131402 Paper: self.20260507131402.051	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507131402.051 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 13:15	Success	-	View
exp_self.20260507130648.050_20260507_130648 Paper: self.20260507130648.050	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507130648.050 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 13:07	Success	-	View
exp_self.20260507125939.049_20260507_125939 Paper: self.20260507125939.049	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507125939.049 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 13:00	Success	-	View
exp_pytrain.20260507125612.013_20260507_125612 Paper: pytrain.20260507125612.013	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 12:57	Success	-	View
exp_self.20260507124932.048_20260507_124932 Paper: self.20260507124932.048	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507124932.048 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 12:50	Success	-	View
exp_self.20260507124217.047_20260507_124218 Paper: self.20260507124217.047	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507124217.047 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 12:43	Success	-	View
exp_self.20260507123505.046_20260507_123505 Paper: self.20260507123505.046	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507123505.046 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 12:36	Success	-	View
exp_self.20260507122750.045_20260507_122751 Paper: self.20260507122750.045	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507122750.045 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 12:28	Success	-	View
exp_pytrain.20260507122423.012_20260507_122423 Paper: pytrain.20260507122423.012	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 12:25	Success	-	View
exp_self.20260507121740.044_20260507_121740 Paper: self.20260507121740.044	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507121740.044 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 12:18	Success	-	View
exp_self.20260507121026.043_20260507_121026 Paper: self.20260507121026.043	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507121026.043 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 12:11	Success	-	View
exp_self.20260507120244.042_20260507_120244 Paper: self.20260507120244.042	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507120244.042 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 12:03	Success	-	View
exp_self.20260507115513.041_20260507_115513 Paper: self.20260507115513.041	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507115513.041 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 11:56	Success	-	View
exp_pytrain.20260507115238.011_20260507_115239 Paper: pytrain.20260507115238.011	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 11:53	Success	-	View
exp_self.20260507114717.040_20260507_114717 Paper: self.20260507114717.040	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507114717.040 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 11:48	Success	-	View
exp_self.20260507113947.039_20260507_113947 Paper: self.20260507113947.039	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507113947.039 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 11:40	Success	-	View
exp_hf_2605.02910_20260507_113626 Paper: hf_2605.02910	CreativityBench: Evaluating Agent Creative Reasoning via Affordance-Based Tool Repurposing Paper ID: hf_2605.02910 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-07 11:37	Success	-	View
exp_self.20260507113055.038_20260507_113056 Paper: self.20260507113055.038	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507113055.038 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 11:31	Success	-	View
exp_self.20260507112316.037_20260507_112317 Paper: self.20260507112316.037	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507112316.037 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 11:24	Success	-	View
exp_pytrain.20260507112048.010_20260507_112049 Paper: pytrain.20260507112048.010	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 11:21	Success	-	View
exp_self.20260507111154.036_20260507_111155 Paper: self.20260507111154.036	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507111154.036 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 11:12	Success	-	View
exp_self.20260507110430.035_20260507_110430 Paper: self.20260507110430.035	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507110430.035 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 11:05	Success	-	View
exp_self.20260507105742.034_20260507_105742 Paper: self.20260507105742.034	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507105742.034 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 10:58	Success	-	View
exp_self.20260507105028.033_20260507_105029 Paper: self.20260507105028.033	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507105028.033 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 10:51	Success	-	View
exp_pytrain.20260507104755.009_20260507_104755 Paper: pytrain.20260507104755.009	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 10:48	Success	-	View
exp_self.20260507104101.032_20260507_104102 Paper: self.20260507104101.032	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507104101.032 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 10:42	Success	-	View
exp_self.20260507103323.031_20260507_103323 Paper: self.20260507103323.031	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507103323.031 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 10:34	Success	-	View
exp_self.20260507102549.030_20260507_102550 Paper: self.20260507102549.030	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507102549.030 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 10:26	Success	-	View
exp_self.20260507101812.029_20260507_101813 Paper: self.20260507101812.029	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507101812.029 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 10:19	Success	-	View
exp_pytrain.20260507101544.008_20260507_101544 Paper: pytrain.20260507101544.008	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 10:16	Success	-	View
exp_self.20260507100837.028_20260507_100837 Paper: self.20260507100837.028	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507100837.028 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 10:09	Success	-	View
exp_self.20260507100104.027_20260507_100104 Paper: self.20260507100104.027	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507100104.027 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 10:02	Success	-	View
exp_self.20260507095323.026_20260507_095323 Paper: self.20260507095323.026	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507095323.026 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 09:54	Success	-	View
exp_self.20260507094547.025_20260507_094547 Paper: self.20260507094547.025	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507094547.025 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 09:46	Success	-	View
exp_pytrain.20260507094318.007_20260507_094318 Paper: pytrain.20260507094318.007	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 09:44	Success	-	View
exp_self.20260507093840.024_20260507_093841 Paper: self.20260507093840.024	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507093840.024 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 09:39	Success	-	View
exp_hf_2604.27393_20260507_093538 Paper: hf_2604.27393	MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction Paper ID: hf_2604.27393 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-07 09:36	Success	-	View
exp_self.20260507092937.023_20260507_092937 Paper: self.20260507092937.023	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507092937.023 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 09:30	Success	-	View
exp_self.20260507092143.022_20260507_092143 Paper: self.20260507092143.022	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507092143.022 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 09:22	Success	-	View
exp_self.20260507091349.021_20260507_091349 Paper: self.20260507091349.021	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507091349.021 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 09:14	Success	-	View
exp_pytrain.20260507091119.006_20260507_091119 Paper: pytrain.20260507091119.006	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 09:12	Success	-	View
exp_self.20260507090536.020_20260507_090536 Paper: self.20260507090536.020	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507090536.020 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 09:06	Success	-	View
exp_self.20260507085755.019_20260507_085756 Paper: self.20260507085755.019	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507085755.019 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 08:58	Success	-	View
exp_self.20260507085011.018_20260507_085011 Paper: self.20260507085011.018	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507085011.018 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 08:51	Success	-	View
exp_self.20260507084228.017_20260507_084228 Paper: self.20260507084228.017	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507084228.017 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 08:43	Success	-	View
exp_pytrain.20260507083958.005_20260507_083959 Paper: pytrain.20260507083958.005	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 08:41	Success	-	View
exp_self.20260507083358.016_20260507_083358 Paper: self.20260507083358.016	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507083358.016 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 08:35	Success	-	View
exp_self.20260507082612.015_20260507_082612 Paper: self.20260507082612.015	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507082612.015 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 08:27	Success	-	View
exp_self.20260507081830.014_20260507_081830 Paper: self.20260507081830.014	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507081830.014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 08:19	Success	-	View
exp_self.20260507081052.013_20260507_081052 Paper: self.20260507081052.013	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507081052.013 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 08:11	Success	-	View
exp_pytrain.20260507080816.004_20260507_080816 Paper: pytrain.20260507080816.004	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 08:09	Success	-	View
exp_self.20260507080105.012_20260507_080105 Paper: self.20260507080105.012	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507080105.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 08:02	Success	-	View
exp_self.20260507075326.011_20260507_075326 Paper: self.20260507075326.011	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507075326.011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 07:54	Success	-	View
exp_self.20260507074550.010_20260507_074551 Paper: self.20260507074550.010	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507074550.010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 07:46	Success	-	View
exp_self.20260507073818.009_20260507_073819 Paper: self.20260507073818.009	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507073818.009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 07:39	Success	-	View
exp_pytrain.20260507073544.003_20260507_073545 Paper: pytrain.20260507073544.003	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 07:36	Success	-	View
exp_self.20260507073128.008_20260507_073129 Paper: self.20260507073128.008	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507073128.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 07:32	Success	-	View
exp_self.20260507072351.007_20260507_072351 Paper: self.20260507072351.007	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507072351.007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 07:24	Success	-	View
exp_self.20260507071610.006_20260507_071610 Paper: self.20260507071610.006	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507071610.006 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 07:17	Success	-	View
exp_cr_10.3390_app16104584_20260507_071322 Paper: cr_10.3390_app16104584	Assessing Stand-to-Sit Kinematics via mmWave Radar: A Real-to-Sim Robust Bidirectional State-Space Model Paper ID: cr_10.3390_app16104584 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered b...	05-07 07:14	Success	-	View
exp_self.20260507070604.005_20260507_070604 Paper: self.20260507070604.005	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507070604.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 07:07	Success	-	View
exp_pytrain.20260507070329.002_20260507_070329 Paper: pytrain.20260507070329.002	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 07:04	Success	-	View
exp_self.20260507065759.004_20260507_065759 Paper: self.20260507065759.004	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507065759.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 06:59	Success	-	View
exp_self.20260507064938.003_20260507_064938 Paper: self.20260507064938.003	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507064938.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 06:50	Success	-	View
exp_self.20260507064159.002_20260507_064200 Paper: self.20260507064159.002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507064159.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 06:43	Success	-	View
exp_self.20260507063425.001_20260507_063426 Paper: self.20260507063425.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507063425.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 06:35	Success	-	View
exp_pytrain.20260507063157.001_20260507_063157 Paper: pytrain.20260507063157.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 06:32	Success	-	View
exp_self.20260507062415.1506_20260507_062415 Paper: self.20260507062415.1506	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507062415.1506 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 06:25	Success	-	View
exp_self.20260507061643.1505_20260507_061643 Paper: self.20260507061643.1505	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507061643.1505 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 06:17	Success	-	View
exp_self.20260507060906.1504_20260507_060906 Paper: self.20260507060906.1504	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507060906.1504 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 06:10	Success	-	View
exp_pytrain.20260507060628.375_20260507_060628 Paper: pytrain.20260507060628.375	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 06:07	Success	-	View
exp_self.20260507055929.1503_20260507_055930 Paper: self.20260507055929.1503	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507055929.1503 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 06:00	Success	-	View
exp_self.20260507055148.1502_20260507_055149 Paper: self.20260507055148.1502	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507055148.1502 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 05:52	Success	-	View
exp_self.20260507054416.1501_20260507_054416 Paper: self.20260507054416.1501	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507054416.1501 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 05:45	Success	-	View
exp_self.20260507053723.1500_20260507_053724 Paper: self.20260507053723.1500	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507053723.1500 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 05:38	Success	-	View
exp_pytrain.20260507053455.374_20260507_053455 Paper: pytrain.20260507053455.374	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 05:35	Success	-	View
exp_self.20260507052858.1499_20260507_052858 Paper: self.20260507052858.1499	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507052858.1499 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 05:30	Success	-	View
exp_self.20260507052121.1498_20260507_052121 Paper: self.20260507052121.1498	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507052121.1498 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 05:22	Success	-	View
exp_self.20260507051336.1497_20260507_051336 Paper: self.20260507051336.1497	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507051336.1497 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 05:14	Success	-	View
exp_self.20260507050602.1496_20260507_050603 Paper: self.20260507050602.1496	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507050602.1496 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 05:07	Success	-	View
exp_pytrain.20260507050326.373_20260507_050327 Paper: pytrain.20260507050326.373	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 05:04	Success	-	View
exp_self.20260507045756.1495_20260507_045756 Paper: self.20260507045756.1495	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507045756.1495 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 04:58	Success	-	View
exp_hf_2605.03314_20260507_045431 Paper: hf_2605.03314	When to Think, When to Speak: Learning Disclosure Policies for LLM Reasoning Paper ID: hf_2605.03314 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-07 04:55	Success	-	View
exp_self.20260507045009.1494_20260507_045009 Paper: self.20260507045009.1494	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507045009.1494 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 04:51	Success	-	View
exp_self.20260507044236.1493_20260507_044236 Paper: self.20260507044236.1493	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507044236.1493 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 04:43	Success	-	View
exp_self.20260507043436.1492_20260507_043437 Paper: self.20260507043436.1492	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507043436.1492 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 04:35	Success	-	View
exp_pytrain.20260507043207.372_20260507_043207 Paper: pytrain.20260507043207.372	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 04:33	Success	-	View
exp_self.20260507042608.1491_20260507_042608 Paper: self.20260507042608.1491	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507042608.1491 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 04:27	Success	-	View
exp_self.20260507041915.1490_20260507_041916 Paper: self.20260507041915.1490	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507041915.1490 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 04:20	Success	-	View
exp_self.20260507041025.1489_20260507_041025 Paper: self.20260507041025.1489	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507041025.1489 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 04:11	Success	-	View
exp_self.20260507040248.1488_20260507_040249 Paper: self.20260507040248.1488	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507040248.1488 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 04:03	Success	-	View
exp_pytrain.20260507040012.371_20260507_040012 Paper: pytrain.20260507040012.371	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 04:01	Success	-	View
exp_self.20260507035307.1487_20260507_035307 Paper: self.20260507035307.1487	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507035307.1487 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 03:54	Success	-	View
exp_self.20260507034526.1486_20260507_034526 Paper: self.20260507034526.1486	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507034526.1486 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 03:46	Success	-	View
exp_self.20260507033747.1485_20260507_033747 Paper: self.20260507033747.1485	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507033747.1485 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 03:38	Success	-	View
exp_self.20260507033008.1484_20260507_033009 Paper: self.20260507033008.1484	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507033008.1484 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 03:31	Success	-	View
exp_pytrain.20260507032735.370_20260507_032735 Paper: pytrain.20260507032735.370	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 03:28	Success	-	View
exp_self.20260507032139.1483_20260507_032139 Paper: self.20260507032139.1483	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507032139.1483 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 03:22	Success	-	View
exp_self.20260507031406.1482_20260507_031406 Paper: self.20260507031406.1482	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507031406.1482 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 03:15	Success	-	View
exp_self.20260507030556.1481_20260507_030557 Paper: self.20260507030556.1481	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507030556.1481 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 03:06	Success	-	View
exp_self.20260507025817.1480_20260507_025818 Paper: self.20260507025817.1480	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507025817.1480 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 02:59	Success	-	View
exp_pytrain.20260507025542.369_20260507_025542 Paper: pytrain.20260507025542.369	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 02:56	Success	-	View
exp_self.20260507024939.1479_20260507_024940 Paper: self.20260507024939.1479	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507024939.1479 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 02:50	Success	-	View
exp_self.20260507024205.1478_20260507_024205 Paper: self.20260507024205.1478	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507024205.1478 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 02:43	Success	-	View
exp_self.20260507023432.1477_20260507_023432 Paper: self.20260507023432.1477	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507023432.1477 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 02:35	Success	-	View
exp_self.20260507022658.1476_20260507_022658 Paper: self.20260507022658.1476	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507022658.1476 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 02:28	Success	-	View
exp_pytrain.20260507022422.368_20260507_022422 Paper: pytrain.20260507022422.368	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 02:25	Success	-	View
exp_self.20260507021823.1475_20260507_021823 Paper: self.20260507021823.1475	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507021823.1475 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 02:19	Success	-	View
exp_self.20260507021030.1474_20260507_021030 Paper: self.20260507021030.1474	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507021030.1474 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 02:11	Success	-	View
exp_self.20260507020252.1473_20260507_020253 Paper: self.20260507020252.1473	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507020252.1473 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 02:03	Success	-	View
exp_self.20260507015523.1472_20260507_015524 Paper: self.20260507015523.1472	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507015523.1472 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 01:56	Success	-	View
exp_pytrain.20260507015255.367_20260507_015255 Paper: pytrain.20260507015255.367	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 01:53	Success	-	View
exp_self.20260507014550.1471_20260507_014551 Paper: self.20260507014550.1471	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507014550.1471 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 01:46	Success	-	View
exp_self.20260507013813.1470_20260507_013813 Paper: self.20260507013813.1470	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507013813.1470 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 01:39	Success	-	View
exp_self.20260507013036.1469_20260507_013036 Paper: self.20260507013036.1469	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507013036.1469 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 01:31	Success	-	View
exp_self.20260507012258.1468_20260507_012258 Paper: self.20260507012258.1468	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507012258.1468 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 01:24	Success	-	View
exp_pytrain.20260507012030.366_20260507_012031 Paper: pytrain.20260507012030.366	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 01:21	Success	-	View
exp_self.20260507011325.1467_20260507_011326 Paper: self.20260507011325.1467	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507011325.1467 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 01:14	Success	-	View
exp_self.20260507010550.1466_20260507_010551 Paper: self.20260507010550.1466	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507010550.1466 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 01:06	Success	-	View
exp_self.20260507005850.1465_20260507_005850 Paper: self.20260507005850.1465	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507005850.1465 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 00:59	Success	-	View
exp_self.20260507005118.1464_20260507_005118 Paper: self.20260507005118.1464	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507005118.1464 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 00:52	Success	-	View
exp_pytrain.20260507004842.365_20260507_004842 Paper: pytrain.20260507004842.365	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 00:49	Success	-	View
exp_self.20260507004140.1463_20260507_004140 Paper: self.20260507004140.1463	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507004140.1463 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 00:42	Success	-	View
exp_self.20260507003358.1462_20260507_003358 Paper: self.20260507003358.1462	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507003358.1462 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 00:35	Success	-	View
exp_self.20260507002629.1461_20260507_002629 Paper: self.20260507002629.1461	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507002629.1461 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 00:27	Success	-	View
exp_self.20260507001858.1460_20260507_001859 Paper: self.20260507001858.1460	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507001858.1460 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 00:20	Success	-	View
exp_pytrain.20260507001624.364_20260507_001625 Paper: pytrain.20260507001624.364	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-07 00:17	Success	-	View
exp_hf_2605.05185_20260507_001338 Paper: hf_2605.05185	OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper ID: hf_2605.05185 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-07 00:14	Success	-	View
exp_self.20260507001024.1459_20260507_001024 Paper: self.20260507001024.1459	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507001024.1459 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 00:11	Success	-	View
exp_self.20260507000243.1458_20260507_000243 Paper: self.20260507000243.1458	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260507000243.1458 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-07 00:03	Success	-	View
exp_self.20260506235508.1457_20260506_235508 Paper: self.20260506235508.1457	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506235508.1457 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 23:56	Success	-	View
exp_self.20260506234738.1456_20260506_234739 Paper: self.20260506234738.1456	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506234738.1456 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 23:48	Success	-	View
exp_pytrain.20260506234503.363_20260506_234503 Paper: pytrain.20260506234503.363	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 23:46	Success	-	View
exp_self.20260506233940.1455_20260506_233940 Paper: self.20260506233940.1455	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506233940.1455 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 23:40	Success	-	View
exp_self.20260506233208.1454_20260506_233208 Paper: self.20260506233208.1454	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506233208.1454 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 23:33	Success	-	View
exp_hf_2605.03849_20260506_232848 Paper: hf_2605.03849	Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation Paper ID: hf_2605.03849 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-06 23:29	Success	-	View
exp_self.20260506232315.1453_20260506_232315 Paper: self.20260506232315.1453	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506232315.1453 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 23:24	Success	-	View
exp_hf_2605.04569_20260506_231736 Paper: hf_2605.04569	Lightning Unified Video Editing via In-Context Sparse Attention Paper ID: hf_2605.04569 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-06 23:18	Success	-	View
exp_self.20260506231531.1452_20260506_231531 Paper: self.20260506231531.1452	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506231531.1452 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 23:16	Success	-	View
exp_pytrain.20260506231255.362_20260506_231255 Paper: pytrain.20260506231255.362	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 23:13	Success	-	View
exp_self.20260506230837.1451_20260506_230837 Paper: self.20260506230837.1451	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506230837.1451 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 23:09	Success	-	View
exp_hf_2605.03269_20260506_230544 Paper: hf_2605.03269	RLDX-1 Technical Report Paper ID: hf_2605.03269 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-06 23:06	Success	-	View
exp_self.20260506225816.1450_20260506_225816 Paper: self.20260506225816.1450	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506225816.1450 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 22:59	Success	-	View
exp_self.20260506225047.1449_20260506_225047 Paper: self.20260506225047.1449	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506225047.1449 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 22:51	Success	-	View
exp_self.20260506224318.1448_20260506_224318 Paper: self.20260506224318.1448	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506224318.1448 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 22:44	Success	-	View
exp_pytrain.20260506224043.361_20260506_224043 Paper: pytrain.20260506224043.361	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 22:41	Success	-	View
exp_self.20260506223626.1447_20260506_223626 Paper: self.20260506223626.1447	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506223626.1447 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 22:37	Success	-	View
exp_self.20260506222850.1446_20260506_222850 Paper: self.20260506222850.1446	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506222850.1446 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 22:29	Success	-	View
exp_self.20260506222117.1445_20260506_222117 Paper: self.20260506222117.1445	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506222117.1445 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 22:22	Success	-	View
exp_self.20260506221340.1444_20260506_221340 Paper: self.20260506221340.1444	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506221340.1444 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 22:14	Success	-	View
exp_pytrain.20260506220829.360_20260506_220829 Paper: pytrain.20260506220829.360	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 22:09	Success	-	View
exp_self.20260506220602.1443_20260506_220602 Paper: self.20260506220602.1443	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506220602.1443 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 22:07	Success	-	View
exp_2605.05204v1_20260506_220233 Paper: 2605.05204v1	D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models Paper ID: 2605.05204v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-06 22:03	Success	-	View
exp_self.20260506215754.1442_20260506_215754 Paper: self.20260506215754.1442	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506215754.1442 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 21:58	Success	-	View
exp_self.20260506214951.1441_20260506_214951 Paper: self.20260506214951.1441	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506214951.1441 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 21:50	Success	-	View
exp_self.20260506214156.1440_20260506_214156 Paper: self.20260506214156.1440	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506214156.1440 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 21:42	Success	-	View
exp_pytrain.20260506213650.359_20260506_213650 Paper: pytrain.20260506213650.359	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 21:37	Success	-	View
exp_self.20260506213427.1439_20260506_213427 Paper: self.20260506213427.1439	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506213427.1439 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 21:35	Success	-	View
exp_self.20260506212628.1438_20260506_212628 Paper: self.20260506212628.1438	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506212628.1438 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 21:27	Success	-	View
exp_hf_2605.05204_20260506_212255 Paper: hf_2605.05204	D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models Paper ID: hf_2605.05204 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-06 21:23	Success	-	View
exp_self.20260506211706.1437_20260506_211706 Paper: self.20260506211706.1437	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506211706.1437 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 21:18	Success	-	View
exp_2605.05090v1_20260506_211401 Paper: 2605.05090v1	Automatically Finding and Validating Unexpected Side-Effects of Interventions on Language Models Paper ID: 2605.05090v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-06 21:15	Success	-	View
exp_self.20260506210742.1436_20260506_210743 Paper: self.20260506210742.1436	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506210742.1436 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 21:08	Success	-	View
exp_pytrain.20260506210454.358_20260506_210455 Paper: pytrain.20260506210454.358	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 21:05	Success	-	View
exp_self.20260506210014.1435_20260506_210015 Paper: self.20260506210014.1435	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506210014.1435 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 21:01	Success	-	View
exp_2605.05096v1_20260506_205535 Paper: 2605.05096v1	CapsID: Soft-Routed Variable-Length Semantic IDs for Generative Recommendation Paper ID: 2605.05096v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-06 20:56	Success	-	View
exp_self.20260506205310.1434_20260506_205310 Paper: self.20260506205310.1434	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506205310.1434 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 20:54	Success	-	View
exp_self.20260506204505.1433_20260506_204505 Paper: self.20260506204505.1433	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506204505.1433 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 20:46	Success	-	View
exp_self.20260506203708.1432_20260506_203709 Paper: self.20260506203708.1432	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506203708.1432 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 20:38	Success	-	View
exp_pytrain.20260506203307.357_20260506_203308 Paper: pytrain.20260506203307.357	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 20:34	Success	-	View
exp_self.20260506202939.1431_20260506_202940 Paper: self.20260506202939.1431	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506202939.1431 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 20:30	Success	-	View
exp_self.20260506202142.1430_20260506_202142 Paper: self.20260506202142.1430	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506202142.1430 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 20:22	Success	-	View
exp_self.20260506201336.1429_20260506_201337 Paper: self.20260506201336.1429	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506201336.1429 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 20:14	Success	-	View
exp_gh_is-leeroy-jenkins_Buddy_20260506_201034 Paper: gh_is-leeroy-jenkins_Buddy	is-leeroy-jenkins/Buddy Paper ID: gh_is-leeroy-jenkins_Buddy - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recover...	05-06 20:11	Success	-	View
exp_self.20260506200411.1428_20260506_200411 Paper: self.20260506200411.1428	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506200411.1428 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 20:05	Success	-	View
exp_pytrain.20260506200124.356_20260506_200124 Paper: pytrain.20260506200124.356	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 20:02	Success	-	View
exp_self.20260506195510.1427_20260506_195510 Paper: self.20260506195510.1427	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506195510.1427 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 19:56	Success	-	View
exp_gh_ThoughtTimeMachine_UFCE-Streaming_20260506_195143 Paper: gh_ThoughtTimeMachine_UFCE-Streaming	ThoughtTimeMachine/UFCE-Streaming Paper ID: gh_ThoughtTimeMachine_UFCE-Streaming - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signa...	05-06 19:52	Success	-	View
exp_self.20260506194802.1426_20260506_194802 Paper: self.20260506194802.1426	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506194802.1426 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 19:49	Success	-	View
exp_self.20260506193957.1425_20260506_193958 Paper: self.20260506193957.1425	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506193957.1425 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 19:41	Success	-	View
exp_self.20260506193204.1424_20260506_193204 Paper: self.20260506193204.1424	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506193204.1424 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 19:33	Success	-	View
exp_pytrain.20260506192900.355_20260506_192901 Paper: pytrain.20260506192900.355	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 19:30	Success	-	View
exp_self.20260506192424.1423_20260506_192424 Paper: self.20260506192424.1423	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506192424.1423 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 19:25	Success	-	View
exp_self.20260506191633.1422_20260506_191633 Paper: self.20260506191633.1422	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506191633.1422 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 19:17	Success	-	View
exp_gh_deepspeedai_DeepSpeed_20260506_191307 Paper: gh_deepspeedai_DeepSpeed	deepspeedai/DeepSpeed Paper ID: gh_deepspeedai_DeepSpeed - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 19:14	Success	-	View
exp_self.20260506190822.1421_20260506_190822 Paper: self.20260506190822.1421	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506190822.1421 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 19:09	Success	-	View
exp_gh_im-anishraj_BhojRAG_20260506_190456 Paper: gh_im-anishraj_BhojRAG	im-anishraj/BhojRAG Paper ID: gh_im-anishraj_BhojRAG - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered b...	05-06 19:05	Success	-	View
exp_self.20260506190012.1420_20260506_190012 Paper: self.20260506190012.1420	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506190012.1420 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 19:01	Success	-	View
exp_pytrain.20260506185721.354_20260506_185721 Paper: pytrain.20260506185721.354	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 18:58	Success	-	View
exp_self.20260506185137.1419_20260506_185137 Paper: self.20260506185137.1419	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506185137.1419 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 18:52	Success	-	View
exp_self.20260506184341.1418_20260506_184342 Paper: self.20260506184341.1418	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506184341.1418 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 18:44	Success	-	View
exp_self.20260506183551.1417_20260506_183552 Paper: self.20260506183551.1417	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506183551.1417 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 18:36	Success	-	View
exp_self.20260506182801.1416_20260506_182801 Paper: self.20260506182801.1416	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506182801.1416 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 18:29	Success	-	View
exp_pytrain.20260506182511.353_20260506_182511 Paper: pytrain.20260506182511.353	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 18:26	Success	-	View
exp_self.20260506181924.1415_20260506_181924 Paper: self.20260506181924.1415	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506181924.1415 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 18:20	Success	-	View
exp_self.20260506181129.1414_20260506_181129 Paper: self.20260506181129.1414	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506181129.1414 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 18:12	Success	-	View
exp_self.20260506180333.1413_20260506_180334 Paper: self.20260506180333.1413	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506180333.1413 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 18:04	Success	-	View
exp_self.20260506175541.1412_20260506_175542 Paper: self.20260506175541.1412	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506175541.1412 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 17:56	Success	-	View
exp_pytrain.20260506175246.352_20260506_175246 Paper: pytrain.20260506175246.352	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 17:53	Success	-	View
exp_self.20260506174702.1411_20260506_174702 Paper: self.20260506174702.1411	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506174702.1411 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 17:48	Success	-	View
exp_self.20260506173906.1410_20260506_173907 Paper: self.20260506173906.1410	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506173906.1410 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 17:40	Success	-	View
exp_self.20260506173111.1409_20260506_173112 Paper: self.20260506173111.1409	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506173111.1409 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 17:32	Success	-	View
exp_self.20260506172315.1408_20260506_172315 Paper: self.20260506172315.1408	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506172315.1408 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 17:24	Success	-	View
exp_pytrain.20260506172028.351_20260506_172028 Paper: pytrain.20260506172028.351	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 17:21	Success	-	View
exp_self.20260506171414.1407_20260506_171415 Paper: self.20260506171414.1407	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506171414.1407 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 17:15	Success	-	View
exp_self.20260506170619.1406_20260506_170620 Paper: self.20260506170619.1406	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506170619.1406 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 17:07	Success	-	View
exp_self.20260506165827.1405_20260506_165828 Paper: self.20260506165827.1405	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506165827.1405 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 16:59	Success	-	View
exp_self.20260506165035.1404_20260506_165035 Paper: self.20260506165035.1404	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506165035.1404 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 16:51	Success	-	View
exp_pytrain.20260506164738.350_20260506_164738 Paper: pytrain.20260506164738.350	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 16:48	Success	-	View
exp_self.20260506164157.1403_20260506_164157 Paper: self.20260506164157.1403	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506164157.1403 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 16:42	Success	-	View
exp_self.20260506163402.1402_20260506_163402 Paper: self.20260506163402.1402	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506163402.1402 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 16:35	Success	-	View
exp_self.20260506162603.1401_20260506_162603 Paper: self.20260506162603.1401	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506162603.1401 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 16:27	Success	-	View
exp_self.20260506161806.1400_20260506_161806 Paper: self.20260506161806.1400	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506161806.1400 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 16:19	Success	-	View
exp_pytrain.20260506161515.349_20260506_161516 Paper: pytrain.20260506161515.349	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 16:16	Success	-	View
exp_self.20260506161037.1399_20260506_161037 Paper: self.20260506161037.1399	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506161037.1399 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 16:11	Success	-	View
exp_self.20260506160242.1398_20260506_160242 Paper: self.20260506160242.1398	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506160242.1398 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 16:03	Success	-	View
exp_self.20260506155444.1397_20260506_155444 Paper: self.20260506155444.1397	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506155444.1397 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 15:55	Success	-	View
exp_self.20260506154638.1396_20260506_154638 Paper: self.20260506154638.1396	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506154638.1396 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 15:47	Success	-	View
exp_pytrain.20260506154339.348_20260506_154339 Paper: pytrain.20260506154339.348	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 15:44	Success	-	View
exp_self.20260506153554.1395_20260506_153555 Paper: self.20260506153554.1395	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506153554.1395 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 15:36	Success	-	View
exp_self.20260506152858.1394_20260506_152858 Paper: self.20260506152858.1394	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506152858.1394 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 15:30	Success	-	View
exp_self.20260506152201.1393_20260506_152201 Paper: self.20260506152201.1393	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506152201.1393 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 15:23	Success	-	View
exp_self.20260506151346.1392_20260506_151347 Paper: self.20260506151346.1392	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506151346.1392 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 15:14	Success	-	View
exp_pytrain.20260506151043.347_20260506_151044 Paper: pytrain.20260506151043.347	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 15:11	Success	-	View
exp_self.20260506150548.1391_20260506_150549 Paper: self.20260506150548.1391	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506150548.1391 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 15:06	Success	-	View
exp_self.20260506145736.1390_20260506_145737 Paper: self.20260506145736.1390	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506145736.1390 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 14:58	Success	-	View
exp_self.20260506144917.1389_20260506_144918 Paper: self.20260506144917.1389	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506144917.1389 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 14:50	Success	-	View
exp_self.20260506144118.1388_20260506_144118 Paper: self.20260506144118.1388	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506144118.1388 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 14:42	Success	-	View
exp_pytrain.20260506143829.346_20260506_143830 Paper: pytrain.20260506143829.346	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 14:39	Success	-	View
exp_self.20260506143343.1387_20260506_143343 Paper: self.20260506143343.1387	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506143343.1387 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 14:34	Success	-	View
exp_self.20260506142557.1386_20260506_142558 Paper: self.20260506142557.1386	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506142557.1386 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 14:27	Success	-	View
exp_self.20260506141756.1385_20260506_141757 Paper: self.20260506141756.1385	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506141756.1385 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 14:19	Success	-	View
exp_self.20260506140953.1384_20260506_140954 Paper: self.20260506140953.1384	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506140953.1384 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 14:10	Success	-	View
exp_pytrain.20260506140648.345_20260506_140649 Paper: pytrain.20260506140648.345	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 14:07	Success	-	View
exp_self.20260506140202.1383_20260506_140202 Paper: self.20260506140202.1383	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506140202.1383 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 14:03	Success	-	View
exp_self.20260506135352.1382_20260506_135352 Paper: self.20260506135352.1382	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506135352.1382 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 13:54	Success	-	View
exp_self.20260506134544.1381_20260506_134545 Paper: self.20260506134544.1381	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506134544.1381 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 13:46	Success	-	View
exp_self.20260506133740.1380_20260506_133740 Paper: self.20260506133740.1380	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506133740.1380 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 13:38	Success	-	View
exp_pytrain.20260506133436.344_20260506_133436 Paper: pytrain.20260506133436.344	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 13:35	Success	-	View
exp_self.20260506132951.1379_20260506_132951 Paper: self.20260506132951.1379	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506132951.1379 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 13:30	Success	-	View
exp_self.20260506132146.1378_20260506_132147 Paper: self.20260506132146.1378	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506132146.1378 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 13:22	Success	-	View
exp_self.20260506131341.1377_20260506_131341 Paper: self.20260506131341.1377	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506131341.1377 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 13:14	Success	-	View
exp_self.20260506130534.1376_20260506_130535 Paper: self.20260506130534.1376	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506130534.1376 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 13:06	Success	-	View
exp_pytrain.20260506130229.343_20260506_130229 Paper: pytrain.20260506130229.343	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 13:03	Success	-	View
exp_self.20260506125636.1375_20260506_125636 Paper: self.20260506125636.1375	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506125636.1375 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 12:57	Success	-	View
exp_self.20260506124942.1374_20260506_124942 Paper: self.20260506124942.1374	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506124942.1374 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 12:50	Success	-	View
exp_self.20260506124134.1373_20260506_124134 Paper: self.20260506124134.1373	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506124134.1373 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 12:42	Success	-	View
exp_self.20260506123325.1372_20260506_123326 Paper: self.20260506123325.1372	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506123325.1372 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 12:34	Success	-	View
exp_pytrain.20260506123020.342_20260506_123021 Paper: pytrain.20260506123020.342	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 12:31	Success	-	View
exp_hf_2605.02913_20260506_122724 Paper: hf_2605.02913	Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning Paper ID: hf_2605.02913 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-06 12:28	Success	-	View
exp_self.20260506122243.1371_20260506_122244 Paper: self.20260506122243.1371	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506122243.1371 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 12:23	Success	-	View
exp_self.20260506121448.1370_20260506_121449 Paper: self.20260506121448.1370	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506121448.1370 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 12:15	Success	-	View
exp_self.20260506120656.1369_20260506_120656 Paper: self.20260506120656.1369	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506120656.1369 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 12:07	Success	-	View
exp_self.20260506115901.1368_20260506_115901 Paper: self.20260506115901.1368	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506115901.1368 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 12:00	Success	-	View
exp_pytrain.20260506115614.341_20260506_115615 Paper: pytrain.20260506115614.341	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 11:57	Success	-	View
exp_self.20260506114901.1367_20260506_114901 Paper: self.20260506114901.1367	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506114901.1367 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 11:50	Success	-	View
exp_self.20260506114120.1366_20260506_114120 Paper: self.20260506114120.1366	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506114120.1366 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 11:42	Success	-	View
exp_self.20260506113344.1365_20260506_113344 Paper: self.20260506113344.1365	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506113344.1365 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 11:34	Success	-	View
exp_self.20260506112621.1364_20260506_112622 Paper: self.20260506112621.1364	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506112621.1364 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 11:27	Success	-	View
exp_pytrain.20260506112352.340_20260506_112352 Paper: pytrain.20260506112352.340	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 11:24	Success	-	View
exp_self.20260506111730.1363_20260506_111730 Paper: self.20260506111730.1363	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506111730.1363 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 11:18	Success	-	View
exp_self.20260506110953.1362_20260506_110954 Paper: self.20260506110953.1362	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506110953.1362 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 11:10	Success	-	View
exp_self.20260506110211.1361_20260506_110212 Paper: self.20260506110211.1361	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506110211.1361 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 11:03	Success	-	View
exp_self.20260506105422.1360_20260506_105422 Paper: self.20260506105422.1360	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506105422.1360 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 10:55	Success	-	View
exp_pytrain.20260506105145.339_20260506_105146 Paper: pytrain.20260506105145.339	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 10:52	Success	-	View
exp_self.20260506104615.1359_20260506_104615 Paper: self.20260506104615.1359	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506104615.1359 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 10:47	Success	-	View
exp_self.20260506103815.1358_20260506_103816 Paper: self.20260506103815.1358	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506103815.1358 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 10:39	Success	-	View
exp_self.20260506103029.1357_20260506_103030 Paper: self.20260506103029.1357	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506103029.1357 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 10:31	Success	-	View
exp_self.20260506102252.1356_20260506_102253 Paper: self.20260506102252.1356	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506102252.1356 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 10:23	Success	-	View
exp_pytrain.20260506102024.338_20260506_102025 Paper: pytrain.20260506102024.338	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 10:21	Success	-	View
exp_self.20260506101312.1355_20260506_101312 Paper: self.20260506101312.1355	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506101312.1355 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 10:14	Success	-	View
exp_self.20260506100534.1354_20260506_100535 Paper: self.20260506100534.1354	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506100534.1354 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 10:06	Success	-	View
exp_self.20260506095751.1353_20260506_095752 Paper: self.20260506095751.1353	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506095751.1353 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 09:58	Success	-	View
exp_self.20260506095012.1352_20260506_095012 Paper: self.20260506095012.1352	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506095012.1352 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 09:51	Success	-	View
exp_pytrain.20260506094743.337_20260506_094743 Paper: pytrain.20260506094743.337	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 09:48	Success	-	View
exp_self.20260506094159.1351_20260506_094200 Paper: self.20260506094159.1351	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506094159.1351 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 09:43	Success	-	View
exp_self.20260506093422.1350_20260506_093422 Paper: self.20260506093422.1350	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506093422.1350 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 09:35	Success	-	View
exp_self.20260506092636.1349_20260506_092637 Paper: self.20260506092636.1349	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506092636.1349 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 09:27	Success	-	View
exp_self.20260506091853.1348_20260506_091853 Paper: self.20260506091853.1348	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506091853.1348 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 09:19	Success	-	View
exp_pytrain.20260506091624.336_20260506_091624 Paper: pytrain.20260506091624.336	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 09:17	Success	-	View
exp_self.20260506091054.1347_20260506_091054 Paper: self.20260506091054.1347	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506091054.1347 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 09:11	Success	-	View
exp_self.20260506090314.1346_20260506_090314 Paper: self.20260506090314.1346	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506090314.1346 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 09:04	Success	-	View
exp_self.20260506085533.1345_20260506_085533 Paper: self.20260506085533.1345	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506085533.1345 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 08:56	Success	-	View
exp_self.20260506084742.1344_20260506_084743 Paper: self.20260506084742.1344	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506084742.1344 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 08:48	Success	-	View
exp_pytrain.20260506084502.335_20260506_084502 Paper: pytrain.20260506084502.335	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 08:46	Success	-	View
exp_self.20260506083928.1343_20260506_083929 Paper: self.20260506083928.1343	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506083928.1343 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 08:40	Success	-	View
exp_self.20260506083132.1342_20260506_083133 Paper: self.20260506083132.1342	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506083132.1342 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 08:32	Success	-	View
exp_self.20260506082355.1341_20260506_082355 Paper: self.20260506082355.1341	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506082355.1341 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 08:24	Success	-	View
exp_self.20260506081616.1340_20260506_081616 Paper: self.20260506081616.1340	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506081616.1340 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 08:17	Success	-	View
exp_pytrain.20260506081342.334_20260506_081342 Paper: pytrain.20260506081342.334	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 08:14	Success	-	View
exp_self.20260506080639.1339_20260506_080639 Paper: self.20260506080639.1339	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506080639.1339 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 08:07	Success	-	View
exp_self.20260506075857.1338_20260506_075857 Paper: self.20260506075857.1338	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506075857.1338 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 07:59	Success	-	View
exp_self.20260506075116.1337_20260506_075116 Paper: self.20260506075116.1337	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506075116.1337 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 07:52	Success	-	View
exp_self.20260506074335.1336_20260506_074335 Paper: self.20260506074335.1336	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506074335.1336 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 07:44	Success	-	View
exp_pytrain.20260506074106.333_20260506_074106 Paper: pytrain.20260506074106.333	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 07:42	Success	-	View
exp_self.20260506073355.1335_20260506_073355 Paper: self.20260506073355.1335	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506073355.1335 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 07:34	Success	-	View
exp_self.20260506072620.1334_20260506_072620 Paper: self.20260506072620.1334	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506072620.1334 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 07:27	Success	-	View
exp_self.20260506071834.1333_20260506_071834 Paper: self.20260506071834.1333	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506071834.1333 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 07:19	Success	-	View
exp_self.20260506071049.1332_20260506_071050 Paper: self.20260506071049.1332	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506071049.1332 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 07:11	Success	-	View
exp_pytrain.20260506070822.332_20260506_070822 Paper: pytrain.20260506070822.332	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 07:09	Success	-	View
exp_self.20260506070111.1331_20260506_070111 Paper: self.20260506070111.1331	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506070111.1331 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 07:02	Success	-	View
exp_self.20260506065331.1330_20260506_065332 Paper: self.20260506065331.1330	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506065331.1330 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 06:54	Success	-	View
exp_self.20260506064551.1329_20260506_064551 Paper: self.20260506064551.1329	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506064551.1329 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 06:46	Success	-	View
exp_self.20260506063806.1328_20260506_063806 Paper: self.20260506063806.1328	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506063806.1328 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 06:39	Success	-	View
exp_pytrain.20260506063538.331_20260506_063538 Paper: pytrain.20260506063538.331	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 06:36	Success	-	View
exp_hf_2605.02904_20260506_063138 Paper: hf_2605.02904	StateSMix: Online Lossless Compression via Mamba State Space Models and Sparse N-gram Context Mixing Paper ID: hf_2605.02904 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-06 06:32	Success	-	View
exp_self.20260506062933.1327_20260506_062933 Paper: self.20260506062933.1327	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506062933.1327 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 06:30	Success	-	View
exp_self.20260506062154.1326_20260506_062155 Paper: self.20260506062154.1326	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506062154.1326 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 06:22	Success	-	View
exp_self.20260506061412.1325_20260506_061413 Paper: self.20260506061412.1325	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506061412.1325 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 06:15	Success	-	View
exp_self.20260506060632.1324_20260506_060632 Paper: self.20260506060632.1324	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506060632.1324 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 06:07	Success	-	View
exp_pytrain.20260506060404.330_20260506_060404 Paper: pytrain.20260506060404.330	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 06:05	Success	-	View
exp_self.20260506055656.1323_20260506_055657 Paper: self.20260506055656.1323	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506055656.1323 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 05:57	Success	-	View
exp_self.20260506054921.1322_20260506_054921 Paper: self.20260506054921.1322	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506054921.1322 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 05:50	Success	-	View
exp_self.20260506054146.1321_20260506_054146 Paper: self.20260506054146.1321	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506054146.1321 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 05:42	Success	-	View
exp_self.20260506053353.1320_20260506_053353 Paper: self.20260506053353.1320	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506053353.1320 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 05:34	Success	-	View
exp_pytrain.20260506053124.329_20260506_053124 Paper: pytrain.20260506053124.329	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 05:32	Success	-	View
exp_self.20260506052416.1319_20260506_052417 Paper: self.20260506052416.1319	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506052416.1319 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 05:25	Success	-	View
exp_self.20260506051643.1318_20260506_051643 Paper: self.20260506051643.1318	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506051643.1318 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 05:17	Success	-	View
exp_self.20260506050907.1317_20260506_050907 Paper: self.20260506050907.1317	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506050907.1317 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 05:10	Success	-	View
exp_self.20260506050122.1316_20260506_050123 Paper: self.20260506050122.1316	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506050122.1316 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 05:02	Success	-	View
exp_pytrain.20260506045850.328_20260506_045851 Paper: pytrain.20260506045850.328	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 04:59	Success	-	View
exp_self.20260506045253.1315_20260506_045254 Paper: self.20260506045253.1315	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506045253.1315 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 04:53	Success	-	View
exp_self.20260506044509.1314_20260506_044510 Paper: self.20260506044509.1314	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506044509.1314 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 04:46	Success	-	View
exp_self.20260506043724.1313_20260506_043724 Paper: self.20260506043724.1313	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506043724.1313 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 04:38	Success	-	View
exp_self.20260506042942.1312_20260506_042942 Paper: self.20260506042942.1312	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506042942.1312 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 04:30	Success	-	View
exp_pytrain.20260506042714.327_20260506_042714 Paper: pytrain.20260506042714.327	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 04:28	Success	-	View
exp_self.20260506042130.1311_20260506_042131 Paper: self.20260506042130.1311	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506042130.1311 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 04:22	Success	-	View
exp_self.20260506041350.1310_20260506_041351 Paper: self.20260506041350.1310	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506041350.1310 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 04:14	Success	-	View
exp_self.20260506040558.1309_20260506_040558 Paper: self.20260506040558.1309	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506040558.1309 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 04:07	Success	-	View
exp_self.20260506035818.1308_20260506_035818 Paper: self.20260506035818.1308	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506035818.1308 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 03:59	Success	-	View
exp_pytrain.20260506035550.326_20260506_035550 Paper: pytrain.20260506035550.326	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 03:56	Success	-	View
exp_self.20260506034944.1307_20260506_034945 Paper: self.20260506034944.1307	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506034944.1307 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 03:50	Success	-	View
exp_self.20260506034200.1306_20260506_034201 Paper: self.20260506034200.1306	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506034200.1306 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 03:43	Success	-	View
exp_self.20260506033417.1305_20260506_033417 Paper: self.20260506033417.1305	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506033417.1305 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 03:35	Success	-	View
exp_self.20260506032632.1304_20260506_032632 Paper: self.20260506032632.1304	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506032632.1304 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 03:27	Success	-	View
exp_pytrain.20260506032349.325_20260506_032349 Paper: pytrain.20260506032349.325	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 03:24	Success	-	View
exp_self.20260506031817.1303_20260506_031817 Paper: self.20260506031817.1303	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506031817.1303 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 03:19	Success	-	View
exp_self.20260506031109.1302_20260506_031109 Paper: self.20260506031109.1302	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506031109.1302 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 03:12	Success	-	View
exp_self.20260506030304.1301_20260506_030305 Paper: self.20260506030304.1301	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506030304.1301 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 03:04	Success	-	View
exp_self.20260506025538.1300_20260506_025539 Paper: self.20260506025538.1300	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506025538.1300 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 02:56	Success	-	View
exp_pytrain.20260506025209.324_20260506_025209 Paper: pytrain.20260506025209.324	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 02:53	Success	-	View
exp_self.20260506024648.1299_20260506_024648 Paper: self.20260506024648.1299	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506024648.1299 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 02:47	Success	-	View
exp_hf_2605.00891_20260506_024255 Paper: hf_2605.00891	X2SAM: Any Segmentation in Images and Videos Paper ID: hf_2605.00891 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-06 02:43	Success	-	View
exp_self.20260506023848.1298_20260506_023848 Paper: self.20260506023848.1298	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506023848.1298 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 02:39	Success	-	View
exp_self.20260506023121.1297_20260506_023121 Paper: self.20260506023121.1297	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506023121.1297 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 02:32	Success	-	View
exp_self.20260506022356.1296_20260506_022357 Paper: self.20260506022356.1296	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506022356.1296 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 02:25	Success	-	View
exp_pytrain.20260506022030.323_20260506_022030 Paper: pytrain.20260506022030.323	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 02:21	Success	-	View
exp_self.20260506021521.1295_20260506_021521 Paper: self.20260506021521.1295	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506021521.1295 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 02:16	Success	-	View
exp_self.20260506020756.1294_20260506_020756 Paper: self.20260506020756.1294	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506020756.1294 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 02:09	Success	-	View
exp_self.20260506015842.1293_20260506_015842 Paper: self.20260506015842.1293	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506015842.1293 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 01:59	Success	-	View
exp_self.20260506015121.1292_20260506_015121 Paper: self.20260506015121.1292	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506015121.1292 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 01:52	Success	-	View
exp_pytrain.20260506014801.322_20260506_014801 Paper: pytrain.20260506014801.322	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 01:49	Success	-	View
exp_hf_2605.01371_20260506_014330 Paper: hf_2605.01371	ESARBench: A Benchmark for Agentic UAV Embodied Search and Rescue Paper ID: hf_2605.01371 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-06 01:44	Success	-	View
exp_self.20260506014034.1291_20260506_014034 Paper: self.20260506014034.1291	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506014034.1291 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 01:41	Success	-	View
exp_self.20260506013314.1290_20260506_013315 Paper: self.20260506013314.1290	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506013314.1290 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 01:34	Success	-	View
exp_self.20260506012552.1289_20260506_012552 Paper: self.20260506012552.1289	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506012552.1289 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 01:26	Success	-	View
exp_self.20260506011836.1288_20260506_011836 Paper: self.20260506011836.1288	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506011836.1288 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 01:19	Success	-	View
exp_pytrain.20260506011505.321_20260506_011505 Paper: pytrain.20260506011505.321	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 01:16	Success	-	View
exp_self.20260506010825.1287_20260506_010825 Paper: self.20260506010825.1287	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506010825.1287 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 01:09	Success	-	View
exp_self.20260506010052.1286_20260506_010052 Paper: self.20260506010052.1286	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506010052.1286 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 01:01	Success	-	View
exp_self.20260506005328.1285_20260506_005328 Paper: self.20260506005328.1285	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506005328.1285 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 00:54	Success	-	View
exp_self.20260506004614.1284_20260506_004614 Paper: self.20260506004614.1284	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506004614.1284 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 00:47	Success	-	View
exp_pytrain.20260506004250.320_20260506_004250 Paper: pytrain.20260506004250.320	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 00:43	Success	-	View
exp_self.20260506003606.1283_20260506_003607 Paper: self.20260506003606.1283	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506003606.1283 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 00:37	Success	-	View
exp_self.20260506002849.1282_20260506_002849 Paper: self.20260506002849.1282	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506002849.1282 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 00:29	Success	-	View
exp_self.20260506002126.1281_20260506_002126 Paper: self.20260506002126.1281	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506002126.1281 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 00:22	Success	-	View
exp_self.20260506001414.1280_20260506_001414 Paper: self.20260506001414.1280	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506001414.1280 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 00:15	Success	-	View
exp_pytrain.20260506001053.319_20260506_001054 Paper: pytrain.20260506001053.319	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-06 00:11	Success	-	View
exp_self.20260506000411.1279_20260506_000412 Paper: self.20260506000411.1279	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260506000411.1279 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-06 00:05	Success	-	View
exp_self.20260505235640.1278_20260505_235640 Paper: self.20260505235640.1278	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505235640.1278 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 23:57	Success	-	View
exp_self.20260505234923.1277_20260505_234923 Paper: self.20260505234923.1277	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505234923.1277 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 23:50	Success	-	View
exp_self.20260505234203.1276_20260505_234203 Paper: self.20260505234203.1276	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505234203.1276 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 23:43	Success	-	View
exp_pytrain.20260505233843.318_20260505_233843 Paper: pytrain.20260505233843.318	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 23:39	Success	-	View
exp_self.20260505233157.1275_20260505_233157 Paper: self.20260505233157.1275	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505233157.1275 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 23:33	Success	-	View
exp_self.20260505232441.1274_20260505_232441 Paper: self.20260505232441.1274	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505232441.1274 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 23:25	Success	-	View
exp_cr_10.1093_ehjdh_ztag070_20260505_231942 Paper: cr_10.1093_ehjdh_ztag070	Automated Full-text screening and accelerated reviews using large language models with Context-Aware Agents: An explorat... Paper ID: cr_10.1093_ehjdh_ztag070 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 23:20	Success	-	View
exp_self.20260505231641.1273_20260505_231641 Paper: self.20260505231641.1273	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505231641.1273 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 23:17	Success	-	View
exp_self.20260505230927.1272_20260505_230927 Paper: self.20260505230927.1272	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505230927.1272 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 23:10	Success	-	View
exp_pytrain.20260505230558.317_20260505_230558 Paper: pytrain.20260505230558.317	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 23:07	Success	-	View
exp_hf_2605.01284_20260505_230319 Paper: hf_2605.01284	Chain of Evidence: Pixel-Level Visual Attribution for Iterative Retrieval-Augmented Generation Paper ID: hf_2605.01284 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-05 23:04	Success	-	View
exp_self.20260505230025.1271_20260505_230025 Paper: self.20260505230025.1271	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505230025.1271 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 23:01	Success	-	View
exp_self.20260505225303.1270_20260505_225303 Paper: self.20260505225303.1270	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505225303.1270 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 22:54	Success	-	View
exp_self.20260505224545.1269_20260505_224546 Paper: self.20260505224545.1269	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505224545.1269 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 22:46	Success	-	View
exp_self.20260505223823.1268_20260505_223823 Paper: self.20260505223823.1268	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505223823.1268 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 22:39	Success	-	View
exp_pytrain.20260505223353.316_20260505_223353 Paper: pytrain.20260505223353.316	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 22:34	Success	-	View
exp_self.20260505223102.1267_20260505_223102 Paper: self.20260505223102.1267	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505223102.1267 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 22:32	Success	-	View
exp_2605.04040v1_20260505_222739 Paper: 2605.04040v1	Large Language Models are Universal Reasoners for Visual Generation Paper ID: 2605.04040v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-05 22:28	Success	-	View
exp_self.20260505222132.1266_20260505_222133 Paper: self.20260505222132.1266	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505222132.1266 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 22:22	Success	-	View
exp_hf_2605.01466_20260505_221705 Paper: hf_2605.01466	SplAttN: Bridging 2D and 3D with Gaussian Soft Splatting and Attention for Point Cloud Completion Paper ID: hf_2605.01466 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-05 22:18	Success	-	View
exp_self.20260505221348.1265_20260505_221349 Paper: self.20260505221348.1265	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505221348.1265 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 22:14	Success	-	View
exp_2605.04045v1_20260505_220959 Paper: 2605.04045v1	Audio-Visual Intelligence in Large Foundation Models Paper ID: 2605.04045v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-05 22:11	Success	-	View
exp_self.20260505220445.1264_20260505_220445 Paper: self.20260505220445.1264	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505220445.1264 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 22:05	Success	-	View
exp_pytrain.20260505220011.315_20260505_220011 Paper: pytrain.20260505220011.315	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 22:01	Success	-	View
exp_self.20260505215718.1263_20260505_215718 Paper: self.20260505215718.1263	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505215718.1263 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 21:58	Success	-	View
exp_hf_2604.28123_20260505_215348 Paper: hf_2604.28123	Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL Paper ID: hf_2604.28123 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-05 21:54	Success	-	View
exp_gh_Deor736_casullens_20260505_215028 Paper: gh_Deor736_casullens	Deor736/casullens Paper ID: gh_Deor736_casullens - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered ben...	05-05 21:51	Success	-	View
exp_self.20260505214514.1262_20260505_214515 Paper: self.20260505214514.1262	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505214514.1262 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 21:46	Success	-	View
exp_self.20260505213759.1261_20260505_213759 Paper: self.20260505213759.1261	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505213759.1261 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 21:39	Success	-	View
exp_hf_2605.02943_20260505_213407 Paper: hf_2605.02943	Healthcare AI GYM for Medical Agents Paper ID: hf_2605.02943 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-05 21:35	Success	-	View
exp_self.20260505213001.1260_20260505_213001 Paper: self.20260505213001.1260	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505213001.1260 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 21:31	Success	-	View
exp_pytrain.20260505212638.314_20260505_212638 Paper: pytrain.20260505212638.314	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 21:27	Success	-	View
exp_self.20260505212240.1259_20260505_212240 Paper: self.20260505212240.1259	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505212240.1259 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 21:23	Success	-	View
exp_self.20260505211524.1258_20260505_211525 Paper: self.20260505211524.1258	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505211524.1258 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 21:16	Success	-	View
exp_2605.03969v1_20260505_211025 Paper: 2605.03969v1	Feature-Augmented Transformers for Robust AI-Text Detection Across Domains and Generators Paper ID: 2605.03969v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-05 21:11	Success	-	View
exp_self.20260505210725.1257_20260505_210726 Paper: self.20260505210725.1257	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505210725.1257 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 21:08	Success	-	View
exp_2605.03953v1_20260505_210336 Paper: 2605.03953v1	Transformers with Selective Access to Early Representations Paper ID: 2605.03953v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-05 21:04	Success	-	View
exp_self.20260505205824.1256_20260505_205824 Paper: self.20260505205824.1256	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505205824.1256 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 20:59	Success	-	View
exp_pytrain.20260505205456.313_20260505_205456 Paper: pytrain.20260505205456.313	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 20:56	Success	-	View
exp_self.20260505205058.1255_20260505_205058 Paper: self.20260505205058.1255	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505205058.1255 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 20:52	Success	-	View
exp_hf_2605.04012_20260505_204705 Paper: hf_2605.04012	SymptomAI: Towards a Conversational AI Agent for Everyday Symptom Assessment Paper ID: hf_2605.04012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-05 20:48	Success	-	View
exp_self.20260505204143.1254_20260505_204143 Paper: self.20260505204143.1254	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505204143.1254 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 20:42	Success	-	View
exp_self.20260505203329.1253_20260505_203329 Paper: self.20260505203329.1253	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505203329.1253 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 20:34	Success	-	View
exp_self.20260505202614.1252_20260505_202614 Paper: self.20260505202614.1252	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505202614.1252 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 20:27	Success	-	View
exp_pytrain.20260505202244.312_20260505_202245 Paper: pytrain.20260505202244.312	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 20:23	Success	-	View
exp_self.20260505201556.1251_20260505_201556 Paper: self.20260505201556.1251	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505201556.1251 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 20:17	Success	-	View
exp_self.20260505200835.1250_20260505_200835 Paper: self.20260505200835.1250	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505200835.1250 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 20:09	Success	-	View
exp_self.20260505200121.1249_20260505_200122 Paper: self.20260505200121.1249	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505200121.1249 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 20:02	Success	-	View
exp_self.20260505195407.1248_20260505_195407 Paper: self.20260505195407.1248	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505195407.1248 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 19:55	Success	-	View
exp_pytrain.20260505195036.311_20260505_195037 Paper: pytrain.20260505195036.311	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 19:51	Success	-	View
exp_self.20260505194354.1247_20260505_194354 Paper: self.20260505194354.1247	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505194354.1247 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 19:44	Success	-	View
exp_self.20260505193634.1246_20260505_193635 Paper: self.20260505193634.1246	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505193634.1246 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 19:37	Success	-	View
exp_self.20260505192915.1245_20260505_192916 Paper: self.20260505192915.1245	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505192915.1245 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 19:30	Success	-	View
exp_self.20260505192149.1244_20260505_192149 Paper: self.20260505192149.1244	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505192149.1244 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 19:22	Success	-	View
exp_pytrain.20260505191823.310_20260505_191823 Paper: pytrain.20260505191823.310	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 19:19	Success	-	View
exp_self.20260505191142.1243_20260505_191143 Paper: self.20260505191142.1243	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505191142.1243 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 19:12	Success	-	View
exp_self.20260505190417.1242_20260505_190417 Paper: self.20260505190417.1242	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505190417.1242 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 19:05	Success	-	View
exp_self.20260505185651.1241_20260505_185651 Paper: self.20260505185651.1241	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505185651.1241 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 18:57	Success	-	View
exp_self.20260505184926.1240_20260505_184926 Paper: self.20260505184926.1240	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505184926.1240 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 18:50	Success	-	View
exp_pytrain.20260505184606.309_20260505_184606 Paper: pytrain.20260505184606.309	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 18:47	Success	-	View
exp_self.20260505183935.1239_20260505_183936 Paper: self.20260505183935.1239	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505183935.1239 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 18:40	Success	-	View
exp_self.20260505183156.1238_20260505_183156 Paper: self.20260505183156.1238	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505183156.1238 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 18:32	Success	-	View
exp_self.20260505182422.1237_20260505_182422 Paper: self.20260505182422.1237	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505182422.1237 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 18:25	Success	-	View
exp_self.20260505181651.1236_20260505_181651 Paper: self.20260505181651.1236	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505181651.1236 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 18:17	Success	-	View
exp_pytrain.20260505181418.308_20260505_181419 Paper: pytrain.20260505181418.308	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 18:15	Success	-	View
exp_self.20260505180708.1235_20260505_180709 Paper: self.20260505180708.1235	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505180708.1235 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 18:08	Success	-	View
exp_self.20260505175935.1234_20260505_175935 Paper: self.20260505175935.1234	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505175935.1234 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 18:00	Success	-	View
exp_self.20260505175200.1233_20260505_175200 Paper: self.20260505175200.1233	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505175200.1233 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 17:53	Success	-	View
exp_self.20260505174431.1232_20260505_174431 Paper: self.20260505174431.1232	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505174431.1232 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 17:45	Success	-	View
exp_pytrain.20260505174203.307_20260505_174203 Paper: pytrain.20260505174203.307	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 17:43	Success	-	View
exp_self.20260505173745.1231_20260505_173745 Paper: self.20260505173745.1231	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505173745.1231 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 17:38	Success	-	View
exp_hf_2605.00925_20260505_173448 Paper: hf_2605.00925	Linking spatial biology and clinical histology via Haiku Paper ID: hf_2605.00925 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-05 17:35	Success	-	View
exp_self.20260505172739.1230_20260505_172740 Paper: self.20260505172739.1230	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505172739.1230 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 17:28	Success	-	View
exp_self.20260505172004.1229_20260505_172005 Paper: self.20260505172004.1229	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505172004.1229 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 17:21	Success	-	View
exp_self.20260505171230.1228_20260505_171231 Paper: self.20260505171230.1228	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505171230.1228 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 17:13	Success	-	View
exp_pytrain.20260505170948.306_20260505_170949 Paper: pytrain.20260505170948.306	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 17:10	Success	-	View
exp_self.20260505170305.1227_20260505_170306 Paper: self.20260505170305.1227	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505170305.1227 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 17:04	Success	-	View
exp_self.20260505165541.1226_20260505_165542 Paper: self.20260505165541.1226	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505165541.1226 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 16:56	Success	-	View
exp_self.20260505164819.1225_20260505_164819 Paper: self.20260505164819.1225	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505164819.1225 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 16:49	Success	-	View
exp_self.20260505164058.1224_20260505_164058 Paper: self.20260505164058.1224	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505164058.1224 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 16:42	Success	-	View
exp_pytrain.20260505163732.305_20260505_163733 Paper: pytrain.20260505163732.305	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 16:38	Success	-	View
exp_self.20260505163223.1223_20260505_163223 Paper: self.20260505163223.1223	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505163223.1223 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 16:33	Success	-	View
exp_self.20260505162502.1222_20260505_162502 Paper: self.20260505162502.1222	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505162502.1222 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 16:26	Success	-	View
exp_self.20260505161709.1221_20260505_161709 Paper: self.20260505161709.1221	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505161709.1221 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 16:18	Success	-	View
exp_self.20260505160839.1220_20260505_160840 Paper: self.20260505160839.1220	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505160839.1220 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 16:09	Success	-	View
exp_pytrain.20260505160515.304_20260505_160515 Paper: pytrain.20260505160515.304	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 16:06	Success	-	View
exp_self.20260505155833.1219_20260505_155833 Paper: self.20260505155833.1219	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505155833.1219 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 15:59	Success	-	View
exp_self.20260505155120.1218_20260505_155120 Paper: self.20260505155120.1218	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505155120.1218 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 15:52	Success	-	View
exp_self.20260505154406.1217_20260505_154407 Paper: self.20260505154406.1217	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505154406.1217 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 15:45	Success	-	View
exp_self.20260505153652.1216_20260505_153653 Paper: self.20260505153652.1216	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505153652.1216 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 15:37	Success	-	View
exp_pytrain.20260505153327.303_20260505_153327 Paper: pytrain.20260505153327.303	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 15:34	Success	-	View
exp_self.20260505152648.1215_20260505_152649 Paper: self.20260505152648.1215	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505152648.1215 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 15:27	Success	-	View
exp_self.20260505151932.1214_20260505_151932 Paper: self.20260505151932.1214	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505151932.1214 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 15:20	Success	-	View
exp_self.20260505151218.1213_20260505_151218 Paper: self.20260505151218.1213	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505151218.1213 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 15:13	Success	-	View
exp_self.20260505150506.1212_20260505_150507 Paper: self.20260505150506.1212	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505150506.1212 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 15:06	Success	-	View
exp_pytrain.20260505150136.302_20260505_150137 Paper: pytrain.20260505150136.302	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 15:02	Success	-	View
exp_self.20260505145458.1211_20260505_145459 Paper: self.20260505145458.1211	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505145458.1211 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 14:56	Success	-	View
exp_self.20260505144741.1210_20260505_144741 Paper: self.20260505144741.1210	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505144741.1210 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 14:48	Success	-	View
exp_self.20260505144024.1209_20260505_144025 Paper: self.20260505144024.1209	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505144024.1209 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 14:41	Success	-	View
exp_self.20260505143316.1208_20260505_143316 Paper: self.20260505143316.1208	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505143316.1208 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 14:34	Success	-	View
exp_pytrain.20260505142950.301_20260505_142950 Paper: pytrain.20260505142950.301	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 14:30	Success	-	View
exp_hf_2605.01711_20260505_142634 Paper: hf_2605.01711	Linear-Time Global Visual Modeling without Explicit Attention Paper ID: hf_2605.01711 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-05 14:27	Success	-	View
exp_self.20260505142223.1207_20260505_142223 Paper: self.20260505142223.1207	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505142223.1207 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 14:23	Success	-	View
exp_self.20260505141506.1206_20260505_141506 Paper: self.20260505141506.1206	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505141506.1206 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 14:16	Success	-	View
exp_self.20260505140750.1205_20260505_140750 Paper: self.20260505140750.1205	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505140750.1205 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 14:08	Success	-	View
exp_self.20260505140035.1204_20260505_140035 Paper: self.20260505140035.1204	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505140035.1204 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 14:01	Success	-	View
exp_pytrain.20260505135709.300_20260505_135710 Paper: pytrain.20260505135709.300	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 13:58	Success	-	View
exp_self.20260505135028.1203_20260505_135029 Paper: self.20260505135028.1203	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505135028.1203 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 13:51	Success	-	View
exp_self.20260505134315.1202_20260505_134315 Paper: self.20260505134315.1202	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505134315.1202 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 13:44	Success	-	View
exp_self.20260505133606.1201_20260505_133607 Paper: self.20260505133606.1201	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505133606.1201 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 13:37	Success	-	View
exp_self.20260505132851.1200_20260505_132851 Paper: self.20260505132851.1200	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505132851.1200 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 13:29	Success	-	View
exp_pytrain.20260505132526.299_20260505_132526 Paper: pytrain.20260505132526.299	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 13:26	Success	-	View
exp_self.20260505132127.1199_20260505_132127 Paper: self.20260505132127.1199	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505132127.1199 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 13:22	Success	-	View
exp_self.20260505131410.1198_20260505_131410 Paper: self.20260505131410.1198	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505131410.1198 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 13:15	Success	-	View
exp_self.20260505130652.1197_20260505_130652 Paper: self.20260505130652.1197	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505130652.1197 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 13:07	Success	-	View
exp_self.20260505125943.1196_20260505_125943 Paper: self.20260505125943.1196	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505125943.1196 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 13:00	Success	-	View
exp_hf_2605.00632_20260505_125613 Paper: hf_2605.00632	BlenderRAG: High-Fidelity 3D Object Generation via Retrieval-Augmented Code Synthesis Paper ID: hf_2605.00632 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-05 12:57	Success	-	View
exp_pytrain.20260505125319.298_20260505_125319 Paper: pytrain.20260505125319.298	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 12:54	Success	-	View
exp_self.20260505124633.1195_20260505_124634 Paper: self.20260505124633.1195	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505124633.1195 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 12:47	Success	-	View
exp_self.20260505123923.1194_20260505_123923 Paper: self.20260505123923.1194	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505123923.1194 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 12:40	Success	-	View
exp_self.20260505123207.1193_20260505_123207 Paper: self.20260505123207.1193	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505123207.1193 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 12:33	Success	-	View
exp_self.20260505122446.1192_20260505_122447 Paper: self.20260505122446.1192	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505122446.1192 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 12:25	Success	-	View
exp_pytrain.20260505122124.297_20260505_122124 Paper: pytrain.20260505122124.297	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 12:22	Success	-	View
exp_self.20260505121450.1191_20260505_121451 Paper: self.20260505121450.1191	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505121450.1191 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 12:15	Success	-	View
exp_self.20260505120709.1190_20260505_120709 Paper: self.20260505120709.1190	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505120709.1190 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 12:08	Success	-	View
exp_self.20260505115930.1189_20260505_115930 Paper: self.20260505115930.1189	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505115930.1189 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 12:00	Success	-	View
exp_self.20260505115158.1188_20260505_115158 Paper: self.20260505115158.1188	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505115158.1188 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 11:53	Success	-	View
exp_pytrain.20260505114922.296_20260505_114923 Paper: pytrain.20260505114922.296	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 11:50	Success	-	View
exp_self.20260505114322.1187_20260505_114323 Paper: self.20260505114322.1187	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505114322.1187 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 11:44	Success	-	View
exp_self.20260505113549.1186_20260505_113549 Paper: self.20260505113549.1186	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505113549.1186 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 11:36	Success	-	View
exp_self.20260505112816.1185_20260505_112816 Paper: self.20260505112816.1185	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505112816.1185 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 11:29	Success	-	View
exp_self.20260505112043.1184_20260505_112044 Paper: self.20260505112043.1184	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505112043.1184 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 11:21	Success	-	View
exp_pytrain.20260505111804.295_20260505_111805 Paper: pytrain.20260505111804.295	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 11:19	Success	-	View
exp_self.20260505111102.1183_20260505_111102 Paper: self.20260505111102.1183	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505111102.1183 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 11:12	Success	-	View
exp_self.20260505110310.1182_20260505_110310 Paper: self.20260505110310.1182	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505110310.1182 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 11:04	Success	-	View
exp_self.20260505105531.1181_20260505_105532 Paper: self.20260505105531.1181	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505105531.1181 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 10:56	Success	-	View
exp_self.20260505104756.1180_20260505_104757 Paper: self.20260505104756.1180	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505104756.1180 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 10:48	Success	-	View
exp_pytrain.20260505104520.294_20260505_104521 Paper: pytrain.20260505104520.294	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 10:46	Success	-	View
exp_self.20260505103817.1179_20260505_103817 Paper: self.20260505103817.1179	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505103817.1179 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 10:39	Success	-	View
exp_self.20260505103031.1178_20260505_103031 Paper: self.20260505103031.1178	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505103031.1178 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 10:31	Success	-	View
exp_self.20260505102318.1177_20260505_102318 Paper: self.20260505102318.1177	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505102318.1177 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 10:24	Success	-	View
exp_self.20260505101559.1176_20260505_101559 Paper: self.20260505101559.1176	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505101559.1176 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 10:17	Success	-	View
exp_pytrain.20260505101229.293_20260505_101230 Paper: pytrain.20260505101229.293	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 10:13	Success	-	View
exp_self.20260505100542.1175_20260505_100542 Paper: self.20260505100542.1175	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505100542.1175 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 10:06	Success	-	View
exp_self.20260505095822.1174_20260505_095823 Paper: self.20260505095822.1174	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505095822.1174 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 09:59	Success	-	View
exp_self.20260505095107.1173_20260505_095107 Paper: self.20260505095107.1173	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505095107.1173 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 09:52	Success	-	View
exp_self.20260505094352.1172_20260505_094353 Paper: self.20260505094352.1172	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505094352.1172 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 09:44	Success	-	View
exp_pytrain.20260505094022.292_20260505_094023 Paper: pytrain.20260505094022.292	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 09:41	Success	-	View
exp_self.20260505093338.1171_20260505_093338 Paper: self.20260505093338.1171	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505093338.1171 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 09:34	Success	-	View
exp_self.20260505092621.1170_20260505_092621 Paper: self.20260505092621.1170	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505092621.1170 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 09:27	Success	-	View
exp_self.20260505091800.1169_20260505_091800 Paper: self.20260505091800.1169	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505091800.1169 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 09:19	Success	-	View
exp_self.20260505091039.1168_20260505_091039 Paper: self.20260505091039.1168	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505091039.1168 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 09:11	Success	-	View
exp_pytrain.20260505090719.291_20260505_090720 Paper: pytrain.20260505090719.291	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 09:08	Success	-	View
exp_self.20260505090120.1167_20260505_090121 Paper: self.20260505090120.1167	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505090120.1167 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 09:02	Success	-	View
exp_self.20260505085333.1166_20260505_085334 Paper: self.20260505085333.1166	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505085333.1166 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 08:54	Success	-	View
exp_self.20260505084553.1165_20260505_084553 Paper: self.20260505084553.1165	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505084553.1165 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 08:46	Success	-	View
exp_self.20260505083820.1164_20260505_083821 Paper: self.20260505083820.1164	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505083820.1164 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 08:39	Success	-	View
exp_pytrain.20260505083543.290_20260505_083544 Paper: pytrain.20260505083543.290	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 08:36	Success	-	View
exp_self.20260505083124.1163_20260505_083125 Paper: self.20260505083124.1163	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505083124.1163 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 08:32	Success	-	View
exp_self.20260505082348.1162_20260505_082348 Paper: self.20260505082348.1162	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505082348.1162 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 08:24	Success	-	View
exp_self.20260505081547.1161_20260505_081547 Paper: self.20260505081547.1161	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505081547.1161 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 08:16	Success	-	View
exp_self.20260505080828.1160_20260505_080829 Paper: self.20260505080828.1160	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505080828.1160 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 08:09	Success	-	View
exp_pytrain.20260505080351.289_20260505_080351 Paper: pytrain.20260505080351.289	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 08:04	Success	-	View
exp_self.20260505080100.1159_20260505_080100 Paper: self.20260505080100.1159	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505080100.1159 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 08:02	Success	-	View
exp_self.20260505075417.1158_20260505_075417 Paper: self.20260505075417.1158	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505075417.1158 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 07:55	Success	-	View
exp_self.20260505074704.1157_20260505_074704 Paper: self.20260505074704.1157	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505074704.1157 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 07:48	Success	-	View
exp_self.20260505073942.1156_20260505_073943 Paper: self.20260505073942.1156	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505073942.1156 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 07:40	Success	-	View
exp_self.20260505073206.1155_20260505_073207 Paper: self.20260505073206.1155	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505073206.1155 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 07:33	Success	-	View
exp_pytrain.20260505072932.288_20260505_072932 Paper: pytrain.20260505072932.288	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 07:30	Success	-	View
exp_self.20260505072228.1154_20260505_072228 Paper: self.20260505072228.1154	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505072228.1154 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 07:23	Success	-	View
exp_self.20260505071445.1153_20260505_071446 Paper: self.20260505071445.1153	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505071445.1153 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 07:15	Success	-	View
exp_self.20260505070706.1152_20260505_070706 Paper: self.20260505070706.1152	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505070706.1152 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 07:08	Success	-	View
exp_self.20260505065933.1151_20260505_065934 Paper: self.20260505065933.1151	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505065933.1151 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 07:00	Success	-	View
exp_pytrain.20260505065659.287_20260505_065700 Paper: pytrain.20260505065659.287	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 06:58	Success	-	View
exp_self.20260505064958.1150_20260505_064958 Paper: self.20260505064958.1150	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505064958.1150 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 06:51	Success	-	View
exp_self.20260505064217.1149_20260505_064218 Paper: self.20260505064217.1149	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505064217.1149 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 06:43	Success	-	View
exp_self.20260505063437.1148_20260505_063437 Paper: self.20260505063437.1148	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505063437.1148 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 06:35	Success	-	View
exp_self.20260505062706.1147_20260505_062706 Paper: self.20260505062706.1147	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505062706.1147 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 06:28	Success	-	View
exp_pytrain.20260505062433.286_20260505_062434 Paper: pytrain.20260505062433.286	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 06:25	Success	-	View
exp_self.20260505061728.1146_20260505_061729 Paper: self.20260505061728.1146	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505061728.1146 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 06:18	Success	-	View
exp_self.20260505060952.1145_20260505_060953 Paper: self.20260505060952.1145	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505060952.1145 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 06:10	Success	-	View
exp_self.20260505060221.1144_20260505_060221 Paper: self.20260505060221.1144	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505060221.1144 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 06:03	Success	-	View
exp_self.20260505055437.1143_20260505_055437 Paper: self.20260505055437.1143	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505055437.1143 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 05:55	Success	-	View
exp_pytrain.20260505055159.285_20260505_055200 Paper: pytrain.20260505055159.285	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 05:53	Success	-	View
exp_self.20260505054742.1142_20260505_054742 Paper: self.20260505054742.1142	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505054742.1142 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 05:48	Success	-	View
exp_self.20260505053851.1141_20260505_053852 Paper: self.20260505053851.1141	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505053851.1141 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 05:39	Success	-	View
exp_self.20260505053028.1140_20260505_053028 Paper: self.20260505053028.1140	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505053028.1140 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 05:31	Success	-	View
exp_self.20260505052241.1139_20260505_052241 Paper: self.20260505052241.1139	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505052241.1139 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 05:23	Success	-	View
exp_pytrain.20260505052013.284_20260505_052013 Paper: pytrain.20260505052013.284	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 05:21	Success	-	View
exp_self.20260505051311.1138_20260505_051311 Paper: self.20260505051311.1138	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505051311.1138 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 05:14	Success	-	View
exp_self.20260505050534.1137_20260505_050535 Paper: self.20260505050534.1137	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505050534.1137 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 05:06	Success	-	View
exp_self.20260505045805.1136_20260505_045806 Paper: self.20260505045805.1136	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505045805.1136 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 04:59	Success	-	View
exp_hf_2605.00814_20260505_045230 Paper: hf_2605.00814	Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs Paper ID: hf_2605.00814 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-05 04:53	Success	-	View
exp_self.20260505045019.1135_20260505_045019 Paper: self.20260505045019.1135	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505045019.1135 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 04:51	Success	-	View
exp_pytrain.20260505044746.283_20260505_044747 Paper: pytrain.20260505044746.283	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 04:48	Success	-	View
exp_self.20260505044054.1134_20260505_044055 Paper: self.20260505044054.1134	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505044054.1134 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 04:41	Success	-	View
exp_self.20260505043319.1133_20260505_043319 Paper: self.20260505043319.1133	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505043319.1133 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 04:34	Success	-	View
exp_self.20260505042547.1132_20260505_042548 Paper: self.20260505042547.1132	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505042547.1132 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 04:26	Success	-	View
exp_cr_10.1093_nar_gkag425_20260505_042233 Paper: cr_10.1093_nar_gkag425	xBind: an integrated webserver for large language model-enabled cross-molecular protein binding site prediction Paper ID: cr_10.1093_nar_gkag425 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered b...	05-05 04:23	Success	-	View
exp_self.20260505041744.1131_20260505_041745 Paper: self.20260505041744.1131	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505041744.1131 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 04:18	Success	-	View
exp_pytrain.20260505041514.282_20260505_041514 Paper: pytrain.20260505041514.282	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 04:16	Success	-	View
exp_cr_10.3390_fi18050243_20260505_041224 Paper: cr_10.3390_fi18050243	The Trustworthy Model Context Protocol (MCP) Registry: An Architectural Blueprint for Cryptographic Provenance and Runti... Paper ID: cr_10.3390_fi18050243 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered be...	05-05 04:13	Success	-	View
exp_self.20260505040909.1130_20260505_040910 Paper: self.20260505040909.1130	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505040909.1130 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 04:10	Success	-	View
exp_self.20260505040132.1129_20260505_040132 Paper: self.20260505040132.1129	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505040132.1129 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 04:02	Success	-	View
exp_self.20260505035353.1128_20260505_035353 Paper: self.20260505035353.1128	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505035353.1128 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 03:54	Success	-	View
exp_self.20260505034623.1127_20260505_034623 Paper: self.20260505034623.1127	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505034623.1127 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 03:47	Success	-	View
exp_pytrain.20260505034356.281_20260505_034357 Paper: pytrain.20260505034356.281	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 03:44	Success	-	View
exp_self.20260505033646.1126_20260505_033646 Paper: self.20260505033646.1126	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505033646.1126 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 03:37	Success	-	View
exp_self.20260505032913.1125_20260505_032913 Paper: self.20260505032913.1125	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505032913.1125 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 03:30	Success	-	View
exp_self.20260505032127.1124_20260505_032127 Paper: self.20260505032127.1124	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505032127.1124 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 03:22	Success	-	View
exp_self.20260505031355.1123_20260505_031356 Paper: self.20260505031355.1123	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505031355.1123 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 03:14	Success	-	View
exp_pytrain.20260505031129.280_20260505_031130 Paper: pytrain.20260505031129.280	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 03:12	Success	-	View
exp_self.20260505030522.1122_20260505_030523 Paper: self.20260505030522.1122	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505030522.1122 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 03:06	Success	-	View
exp_self.20260505025753.1121_20260505_025753 Paper: self.20260505025753.1121	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505025753.1121 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 02:58	Success	-	View
exp_hf_2605.00529_20260505_025216 Paper: hf_2605.00529	Hierarchical Abstract Tree for Cross-Document Retrieval-Augmented Generation Paper ID: hf_2605.00529 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-05 02:53	Success	-	View
exp_self.20260505025012.1120_20260505_025013 Paper: self.20260505025012.1120	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505025012.1120 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 02:51	Success	-	View
exp_self.20260505024241.1119_20260505_024241 Paper: self.20260505024241.1119	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505024241.1119 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 02:43	Success	-	View
exp_pytrain.20260505024006.279_20260505_024006 Paper: pytrain.20260505024006.279	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 02:41	Success	-	View
exp_self.20260505023303.1118_20260505_023303 Paper: self.20260505023303.1118	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505023303.1118 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 02:34	Success	-	View
exp_self.20260505022529.1117_20260505_022529 Paper: self.20260505022529.1117	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505022529.1117 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 02:26	Success	-	View
exp_self.20260505021759.1116_20260505_021759 Paper: self.20260505021759.1116	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505021759.1116 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 02:19	Success	-	View
exp_self.20260505021032.1115_20260505_021032 Paper: self.20260505021032.1115	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505021032.1115 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 02:11	Success	-	View
exp_pytrain.20260505020742.278_20260505_020743 Paper: pytrain.20260505020742.278	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 02:08	Success	-	View
exp_self.20260505020052.1114_20260505_020052 Paper: self.20260505020052.1114	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505020052.1114 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 02:01	Success	-	View
exp_self.20260505015316.1113_20260505_015316 Paper: self.20260505015316.1113	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505015316.1113 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 01:54	Success	-	View
exp_self.20260505014550.1112_20260505_014550 Paper: self.20260505014550.1112	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505014550.1112 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 01:46	Success	-	View
exp_self.20260505013822.1111_20260505_013823 Paper: self.20260505013822.1111	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505013822.1111 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 01:39	Success	-	View
exp_pytrain.20260505013550.277_20260505_013551 Paper: pytrain.20260505013550.277	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 01:36	Success	-	View
exp_self.20260505012901.1110_20260505_012901 Paper: self.20260505012901.1110	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505012901.1110 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 01:30	Success	-	View
exp_self.20260505012126.1109_20260505_012127 Paper: self.20260505012126.1109	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505012126.1109 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 01:22	Success	-	View
exp_self.20260505011359.1108_20260505_011400 Paper: self.20260505011359.1108	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505011359.1108 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 01:15	Success	-	View
exp_self.20260505010635.1107_20260505_010635 Paper: self.20260505010635.1107	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505010635.1107 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 01:07	Success	-	View
exp_pytrain.20260505010408.276_20260505_010408 Paper: pytrain.20260505010408.276	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 01:05	Success	-	View
exp_self.20260505005709.1106_20260505_005710 Paper: self.20260505005709.1106	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505005709.1106 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 00:58	Success	-	View
exp_self.20260505004941.1105_20260505_004941 Paper: self.20260505004941.1105	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505004941.1105 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 00:50	Success	-	View
exp_self.20260505004211.1104_20260505_004212 Paper: self.20260505004211.1104	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505004211.1104 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 00:43	Success	-	View
exp_self.20260505003445.1103_20260505_003445 Paper: self.20260505003445.1103	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505003445.1103 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 00:35	Success	-	View
exp_pytrain.20260505003217.275_20260505_003217 Paper: pytrain.20260505003217.275	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 00:33	Success	-	View
exp_self.20260505002516.1102_20260505_002517 Paper: self.20260505002516.1102	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505002516.1102 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 00:26	Success	-	View
exp_gh_Edgarzp12_realtime-sentiment-pipeline_20260505_001949 Paper: gh_Edgarzp12_realtime-sentiment-pipeline	Edgarzp12/realtime-sentiment-pipeline Paper ID: gh_Edgarzp12_realtime-sentiment-pipeline - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected S...	05-05 00:20	Success	-	View
exp_self.20260505001740.1101_20260505_001740 Paper: self.20260505001740.1101	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505001740.1101 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 00:18	Success	-	View
exp_self.20260505001010.1100_20260505_001010 Paper: self.20260505001010.1100	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505001010.1100 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 00:11	Success	-	View
exp_self.20260505000326.1099_20260505_000326 Paper: self.20260505000326.1099	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260505000326.1099 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-05 00:04	Success	-	View
exp_pytrain.20260505000054.274_20260505_000054 Paper: pytrain.20260505000054.274	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-05 00:01	Success	-	View
exp_self.20260504235401.1098_20260504_235402 Paper: self.20260504235401.1098	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504235401.1098 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 23:55	Success	-	View
exp_self.20260504234636.1097_20260504_234636 Paper: self.20260504234636.1097	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504234636.1097 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 23:47	Success	-	View
exp_self.20260504233909.1096_20260504_233909 Paper: self.20260504233909.1096	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504233909.1096 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 23:40	Success	-	View
exp_self.20260504233144.1095_20260504_233144 Paper: self.20260504233144.1095	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504233144.1095 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 23:32	Success	-	View
exp_pytrain.20260504232909.273_20260504_232910 Paper: pytrain.20260504232909.273	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 23:30	Success	-	View
exp_self.20260504232208.1094_20260504_232209 Paper: self.20260504232208.1094	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504232208.1094 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 23:23	Success	-	View
exp_self.20260504231445.1093_20260504_231446 Paper: self.20260504231445.1093	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504231445.1093 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 23:15	Success	-	View
exp_self.20260504230716.1092_20260504_230716 Paper: self.20260504230716.1092	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504230716.1092 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 23:08	Success	-	View
exp_self.20260504225953.1091_20260504_225954 Paper: self.20260504225953.1091	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504225953.1091 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 23:00	Success	-	View
exp_pytrain.20260504225721.272_20260504_225721 Paper: pytrain.20260504225721.272	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 22:58	Success	-	View
exp_self.20260504225305.1090_20260504_225305 Paper: self.20260504225305.1090	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504225305.1090 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 22:54	Success	-	View
exp_2605.02884v1_20260504_224951 Paper: 2605.02884v1	Unsupervised Machine Learning for Detecting Structural Anomalies in European Regional Statistics Paper ID: 2605.02884v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-04 22:50	Success	-	View
exp_self.20260504224423.1089_20260504_224423 Paper: self.20260504224423.1089	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504224423.1089 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 22:45	Success	-	View
exp_hf_2605.02881_20260504_224000 Paper: hf_2605.02881	MolmoAct2: Action Reasoning Models for Real-world Deployment Paper ID: hf_2605.02881 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-04 22:41	Success	-	View
exp_self.20260504223647.1088_20260504_223648 Paper: self.20260504223647.1088	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504223647.1088 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 22:37	Success	-	View
exp_2605.02888v1_20260504_223358 Paper: 2605.02888v1	SpecKV: Adaptive Speculative Decoding with Compression-Aware Gamma Selection Paper ID: 2605.02888v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-04 22:35	Success	-	View
exp_cr_10.18664_1994-7852.215.2026.358845_20260504_223110 Paper: cr_10.18664_1994-7852.215.2026.358845	IMPROVEMENT OF CARGO ROUTING TECHNOLOGY AT A CONTAINER HAB USING A COMPREHENSIVE MATHEMATICAL MODEL Paper ID: cr_10.18664_1994-7852.215.2026.358845 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Sign...	05-04 22:32	Success	-	View
exp_self.20260504222756.1087_20260504_222756 Paper: self.20260504222756.1087	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504222756.1087 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 22:28	Success	-	View
exp_pytrain.20260504222524.271_20260504_222524 Paper: pytrain.20260504222524.271	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 22:26	Success	-	View
exp_hf_2605.02222_20260504_222238 Paper: hf_2605.02222	Generative Modeling with Orbit-Space Particle Flow Matching Paper ID: hf_2605.02222 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-04 22:23	Success	-	View
exp_self.20260504221820.1086_20260504_221820 Paper: self.20260504221820.1086	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504221820.1086 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 22:19	Success	-	View
exp_self.20260504221054.1085_20260504_221055 Paper: self.20260504221054.1085	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504221054.1085 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 22:11	Success	-	View
exp_self.20260504220326.1084_20260504_220327 Paper: self.20260504220326.1084	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504220326.1084 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 22:04	Success	-	View
exp_cr_10.3390_vehicles8050101_20260504_215908 Paper: cr_10.3390_vehicles8050101	A Vehicle Type Recognition Network Based on Feature Comparison and Mixture of Experts Model Paper ID: cr_10.3390_vehicles8050101 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recover...	05-04 22:00	Success	-	View
exp_self.20260504215555.1083_20260504_215555 Paper: self.20260504215555.1083	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504215555.1083 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 21:56	Success	-	View
exp_pytrain.20260504215323.270_20260504_215323 Paper: pytrain.20260504215323.270	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 21:54	Success	-	View
exp_self.20260504214632.1082_20260504_214632 Paper: self.20260504214632.1082	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504214632.1082 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 21:47	Success	-	View
exp_self.20260504213909.1081_20260504_213909 Paper: self.20260504213909.1081	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504213909.1081 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 21:40	Success	-	View
exp_2605.02866v1_20260504_213342 Paper: 2605.02866v1	Laplacian Frequency Interaction Network for Rural Thematic Road Extraction Paper ID: 2605.02866v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-04 21:34	Success	-	View
exp_self.20260504213134.1080_20260504_213135 Paper: self.20260504213134.1080	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504213134.1080 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 21:32	Success	-	View
exp_2605.02860v1_20260504_212820 Paper: 2605.02860v1	Standing on the Shoulders of Giants: Stabilized Knowledge Distillation for Cross--Language Code Clone Detection Paper ID: 2605.02860v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-04 21:29	Success	-	View
exp_self.20260504212405.1079_20260504_212405 Paper: self.20260504212405.1079	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504212405.1079 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 21:25	Success	-	View
exp_pytrain.20260504212134.269_20260504_212134 Paper: pytrain.20260504212134.269	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 21:22	Success	-	View
exp_self.20260504211616.1078_20260504_211616 Paper: self.20260504211616.1078	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504211616.1078 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 21:17	Success	-	View
exp_hf_2604.27660_20260504_211154 Paper: hf_2604.27660	From Context to Skills: Can Language Models Learn from Context Skillfully? Paper ID: hf_2604.27660 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-04 21:12	Success	-	View
exp_self.20260504210841.1077_20260504_210842 Paper: self.20260504210841.1077	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504210841.1077 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 21:09	Success	-	View
exp_self.20260504210116.1076_20260504_210116 Paper: self.20260504210116.1076	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504210116.1076 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 21:02	Success	-	View
exp_cr_10.1007_s42452-026-08699-7_20260504_205758 Paper: cr_10.1007_s42452-026-08699-7	A swin transformer enhanced reverse knowledge distillation model for industrial anomaly detection via window-aware stoch... Paper ID: cr_10.1007_s42452-026-08699-7 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Reco...	05-04 20:59	Success	-	View
exp_self.20260504205234.1075_20260504_205235 Paper: self.20260504205234.1075	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504205234.1075 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 20:53	Success	-	View
exp_pytrain.20260504205001.268_20260504_205002 Paper: pytrain.20260504205001.268	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 20:51	Success	-	View
exp_self.20260504204308.1074_20260504_204309 Paper: self.20260504204308.1074	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504204308.1074 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 20:44	Success	-	View
exp_self.20260504203539.1073_20260504_203540 Paper: self.20260504203539.1073	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504203539.1073 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 20:36	Success	-	View
exp_self.20260504202811.1072_20260504_202811 Paper: self.20260504202811.1072	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504202811.1072 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 20:29	Success	-	View
exp_self.20260504202111.1071_20260504_202111 Paper: self.20260504202111.1071	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504202111.1071 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 20:22	Success	-	View
exp_pytrain.20260504201843.267_20260504_201844 Paper: pytrain.20260504201843.267	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 20:19	Success	-	View
exp_self.20260504201141.1070_20260504_201141 Paper: self.20260504201141.1070	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504201141.1070 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 20:12	Success	-	View
exp_self.20260504200416.1069_20260504_200416 Paper: self.20260504200416.1069	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504200416.1069 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 20:05	Success	-	View
exp_self.20260504195649.1068_20260504_195650 Paper: self.20260504195649.1068	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504195649.1068 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 19:57	Success	-	View
exp_self.20260504194916.1067_20260504_194917 Paper: self.20260504194916.1067	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504194916.1067 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 19:50	Success	-	View
exp_pytrain.20260504194648.266_20260504_194648 Paper: pytrain.20260504194648.266	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 19:47	Success	-	View
exp_self.20260504193946.1066_20260504_193946 Paper: self.20260504193946.1066	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504193946.1066 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 19:40	Success	-	View
exp_self.20260504193218.1065_20260504_193218 Paper: self.20260504193218.1065	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504193218.1065 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 19:33	Success	-	View
exp_self.20260504192449.1064_20260504_192450 Paper: self.20260504192449.1064	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504192449.1064 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 19:25	Success	-	View
exp_self.20260504191709.1063_20260504_191709 Paper: self.20260504191709.1063	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504191709.1063 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 19:18	Success	-	View
exp_pytrain.20260504191435.265_20260504_191435 Paper: pytrain.20260504191435.265	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 19:15	Success	-	View
exp_self.20260504190727.1062_20260504_190727 Paper: self.20260504190727.1062	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504190727.1062 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 19:08	Success	-	View
exp_self.20260504185951.1061_20260504_185952 Paper: self.20260504185951.1061	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504185951.1061 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 19:00	Success	-	View
exp_self.20260504185215.1060_20260504_185215 Paper: self.20260504185215.1060	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504185215.1060 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 18:53	Success	-	View
exp_self.20260504184438.1059_20260504_184438 Paper: self.20260504184438.1059	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504184438.1059 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 18:45	Success	-	View
exp_pytrain.20260504184155.264_20260504_184155 Paper: pytrain.20260504184155.264	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 18:42	Success	-	View
exp_self.20260504183448.1058_20260504_183448 Paper: self.20260504183448.1058	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504183448.1058 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 18:35	Success	-	View
exp_self.20260504182710.1057_20260504_182711 Paper: self.20260504182710.1057	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504182710.1057 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 18:28	Success	-	View
exp_self.20260504181933.1056_20260504_181933 Paper: self.20260504181933.1056	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504181933.1056 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 18:20	Success	-	View
exp_self.20260504181208.1055_20260504_181209 Paper: self.20260504181208.1055	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504181208.1055 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 18:13	Success	-	View
exp_pytrain.20260504180936.263_20260504_180937 Paper: pytrain.20260504180936.263	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 18:10	Success	-	View
exp_self.20260504180231.1054_20260504_180231 Paper: self.20260504180231.1054	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504180231.1054 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 18:03	Success	-	View
exp_self.20260504175459.1053_20260504_175459 Paper: self.20260504175459.1053	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504175459.1053 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 17:56	Success	-	View
exp_self.20260504174719.1052_20260504_174720 Paper: self.20260504174719.1052	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504174719.1052 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 17:48	Success	-	View
exp_self.20260504173946.1051_20260504_173946 Paper: self.20260504173946.1051	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504173946.1051 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 17:40	Success	-	View
exp_pytrain.20260504173714.262_20260504_173714 Paper: pytrain.20260504173714.262	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 17:38	Success	-	View
exp_self.20260504173009.1050_20260504_173009 Paper: self.20260504173009.1050	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504173009.1050 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 17:31	Success	-	View
exp_self.20260504172233.1049_20260504_172233 Paper: self.20260504172233.1049	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504172233.1049 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 17:23	Success	-	View
exp_self.20260504171504.1048_20260504_171504 Paper: self.20260504171504.1048	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504171504.1048 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 17:16	Success	-	View
exp_self.20260504170722.1047_20260504_170723 Paper: self.20260504170722.1047	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504170722.1047 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 17:08	Success	-	View
exp_pytrain.20260504170447.261_20260504_170447 Paper: pytrain.20260504170447.261	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 17:05	Success	-	View
exp_self.20260504165739.1046_20260504_165739 Paper: self.20260504165739.1046	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504165739.1046 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 16:58	Success	-	View
exp_self.20260504165007.1045_20260504_165007 Paper: self.20260504165007.1045	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504165007.1045 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 16:51	Success	-	View
exp_self.20260504164237.1044_20260504_164237 Paper: self.20260504164237.1044	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504164237.1044 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 16:43	Success	-	View
exp_hf_2605.00347_20260504_163916 Paper: hf_2605.00347	Odysseus: Scaling VLMs to 100+ Turn Decision-Making in Games via Reinforcement Learning Paper ID: hf_2605.00347 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-04 16:40	Success	-	View
exp_self.20260504163455.1043_20260504_163456 Paper: self.20260504163455.1043	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504163455.1043 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 16:35	Success	-	View
exp_pytrain.20260504163224.260_20260504_163224 Paper: pytrain.20260504163224.260	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 16:33	Success	-	View
exp_self.20260504162518.1042_20260504_162518 Paper: self.20260504162518.1042	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504162518.1042 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 16:26	Success	-	View
exp_self.20260504161749.1041_20260504_161750 Paper: self.20260504161749.1041	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504161749.1041 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 16:18	Success	-	View
exp_self.20260504161021.1040_20260504_161021 Paper: self.20260504161021.1040	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504161021.1040 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 16:11	Success	-	View
exp_self.20260504160329.1039_20260504_160329 Paper: self.20260504160329.1039	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504160329.1039 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 16:04	Success	-	View
exp_pytrain.20260504160101.259_20260504_160101 Paper: pytrain.20260504160101.259	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 16:02	Success	-	View
exp_self.20260504155357.1038_20260504_155358 Paper: self.20260504155357.1038	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504155357.1038 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 15:55	Success	-	View
exp_self.20260504154628.1037_20260504_154629 Paper: self.20260504154628.1037	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504154628.1037 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 15:47	Success	-	View
exp_self.20260504153855.1036_20260504_153856 Paper: self.20260504153855.1036	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504153855.1036 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 15:39	Success	-	View
exp_self.20260504153126.1035_20260504_153126 Paper: self.20260504153126.1035	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504153126.1035 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 15:32	Success	-	View
exp_pytrain.20260504152900.258_20260504_152900 Paper: pytrain.20260504152900.258	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 15:30	Success	-	View
exp_self.20260504152432.1034_20260504_152433 Paper: self.20260504152432.1034	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504152432.1034 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 15:25	Success	-	View
exp_self.20260504151701.1033_20260504_151701 Paper: self.20260504151701.1033	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504151701.1033 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 15:18	Success	-	View
exp_self.20260504150933.1032_20260504_150934 Paper: self.20260504150933.1032	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504150933.1032 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 15:10	Success	-	View
exp_self.20260504150142.1031_20260504_150142 Paper: self.20260504150142.1031	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504150142.1031 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 15:02	Success	-	View
exp_pytrain.20260504145739.257_20260504_145740 Paper: pytrain.20260504145739.257	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 14:58	Success	-	View
exp_self.20260504145034.1030_20260504_145034 Paper: self.20260504145034.1030	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504145034.1030 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 14:51	Success	-	View
exp_self.20260504144253.1029_20260504_144253 Paper: self.20260504144253.1029	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504144253.1029 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 14:43	Success	-	View
exp_hf_2604.27818_20260504_143825 Paper: hf_2604.27818	MASCing: Configurable Mixture-of-Experts Behavior via Activation Steering Masks Paper ID: hf_2604.27818 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-04 14:39	Success	-	View
exp_self.20260504143510.1028_20260504_143511 Paper: self.20260504143510.1028	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504143510.1028 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 14:36	Success	-	View
exp_self.20260504142734.1027_20260504_142734 Paper: self.20260504142734.1027	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504142734.1027 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 14:28	Success	-	View
exp_pytrain.20260504142500.256_20260504_142501 Paper: pytrain.20260504142500.256	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 14:26	Success	-	View
exp_self.20260504141745.1026_20260504_141745 Paper: self.20260504141745.1026	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504141745.1026 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 14:18	Success	-	View
exp_self.20260504141008.1025_20260504_141008 Paper: self.20260504141008.1025	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504141008.1025 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 14:11	Success	-	View
exp_self.20260504140239.1024_20260504_140239 Paper: self.20260504140239.1024	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504140239.1024 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 14:03	Success	-	View
exp_self.20260504135503.1023_20260504_135504 Paper: self.20260504135503.1023	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504135503.1023 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 13:56	Success	-	View
exp_pytrain.20260504135235.255_20260504_135236 Paper: pytrain.20260504135235.255	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 13:53	Success	-	View
exp_self.20260504134534.1022_20260504_134535 Paper: self.20260504134534.1022	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504134534.1022 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 13:46	Success	-	View
exp_self.20260504133806.1021_20260504_133806 Paper: self.20260504133806.1021	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504133806.1021 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 13:39	Success	-	View
exp_self.20260504133036.1020_20260504_133037 Paper: self.20260504133036.1020	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504133036.1020 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 13:31	Success	-	View
exp_self.20260504132301.1019_20260504_132302 Paper: self.20260504132301.1019	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504132301.1019 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 13:24	Success	-	View
exp_pytrain.20260504132031.254_20260504_132032 Paper: pytrain.20260504132031.254	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 13:21	Success	-	View
exp_self.20260504131330.1018_20260504_131331 Paper: self.20260504131330.1018	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504131330.1018 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 13:14	Success	-	View
exp_self.20260504130601.1017_20260504_130601 Paper: self.20260504130601.1017	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504130601.1017 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 13:07	Success	-	View
exp_self.20260504125830.1016_20260504_125830 Paper: self.20260504125830.1016	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504125830.1016 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 12:59	Success	-	View
exp_self.20260504125100.1015_20260504_125100 Paper: self.20260504125100.1015	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504125100.1015 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 12:52	Success	-	View
exp_pytrain.20260504124826.253_20260504_124826 Paper: pytrain.20260504124826.253	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 12:49	Success	-	View
exp_self.20260504124125.1014_20260504_124125 Paper: self.20260504124125.1014	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504124125.1014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 12:42	Success	-	View
exp_self.20260504123351.1013_20260504_123352 Paper: self.20260504123351.1013	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504123351.1013 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 12:34	Success	-	View
exp_self.20260504122622.1012_20260504_122622 Paper: self.20260504122622.1012	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504122622.1012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 12:27	Success	-	View
exp_self.20260504121853.1011_20260504_121853 Paper: self.20260504121853.1011	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504121853.1011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 12:19	Success	-	View
exp_pytrain.20260504121618.252_20260504_121618 Paper: pytrain.20260504121618.252	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 12:17	Success	-	View
exp_self.20260504120917.1010_20260504_120918 Paper: self.20260504120917.1010	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504120917.1010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 12:10	Success	-	View
exp_self.20260504120145.1009_20260504_120145 Paper: self.20260504120145.1009	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504120145.1009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 12:02	Success	-	View
exp_self.20260504115415.1008_20260504_115416 Paper: self.20260504115415.1008	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504115415.1008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 11:55	Success	-	View
exp_self.20260504114645.1007_20260504_114646 Paper: self.20260504114645.1007	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504114645.1007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 11:47	Success	-	View
exp_pytrain.20260504114411.251_20260504_114411 Paper: pytrain.20260504114411.251	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 11:45	Success	-	View
exp_self.20260504113715.1006_20260504_113716 Paper: self.20260504113715.1006	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504113715.1006 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 11:38	Success	-	View
exp_self.20260504112931.1005_20260504_112931 Paper: self.20260504112931.1005	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504112931.1005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 11:30	Success	-	View
exp_self.20260504112151.1004_20260504_112152 Paper: self.20260504112151.1004	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504112151.1004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 11:22	Success	-	View
exp_self.20260504111417.1003_20260504_111417 Paper: self.20260504111417.1003	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504111417.1003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 11:15	Success	-	View
exp_pytrain.20260504111138.250_20260504_111139 Paper: pytrain.20260504111138.250	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 11:12	Success	-	View
exp_self.20260504110432.1002_20260504_110432 Paper: self.20260504110432.1002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504110432.1002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 11:05	Success	-	View
exp_self.20260504105648.1001_20260504_105648 Paper: self.20260504105648.1001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504105648.1001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 10:57	Success	-	View
exp_self.20260504104906.1000_20260504_104906 Paper: self.20260504104906.1000	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504104906.1000 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 10:50	Success	-	View
exp_self.20260504104127.999_20260504_104127 Paper: self.20260504104127.999	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504104127.999 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 10:42	Success	-	View
exp_pytrain.20260504103851.249_20260504_103852 Paper: pytrain.20260504103851.249	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 10:39	Success	-	View
exp_self.20260504103249.998_20260504_103250 Paper: self.20260504103249.998	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504103249.998 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 10:33	Success	-	View
exp_self.20260504102508.997_20260504_102508 Paper: self.20260504102508.997	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504102508.997 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 10:26	Success	-	View
exp_self.20260504101731.996_20260504_101731 Paper: self.20260504101731.996	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504101731.996 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 10:18	Success	-	View
exp_hf_2604.27124_20260504_101406 Paper: hf_2604.27124	Better Models, Faster Training: Sigmoid Attention for single-cell Foundation Models Paper ID: hf_2604.27124 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-04 10:15	Success	-	View
exp_self.20260504100939.995_20260504_100940 Paper: self.20260504100939.995	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504100939.995 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 10:10	Success	-	View
exp_pytrain.20260504100705.248_20260504_100705 Paper: pytrain.20260504100705.248	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 10:08	Success	-	View
exp_self.20260504100128.994_20260504_100128 Paper: self.20260504100128.994	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504100128.994 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 10:02	Success	-	View
exp_self.20260504095348.993_20260504_095348 Paper: self.20260504095348.993	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504095348.993 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 09:54	Success	-	View
exp_self.20260504094557.992_20260504_094558 Paper: self.20260504094557.992	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504094557.992 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 09:47	Success	-	View
exp_self.20260504093817.991_20260504_093817 Paper: self.20260504093817.991	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504093817.991 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 09:39	Success	-	View
exp_pytrain.20260504093543.247_20260504_093544 Paper: pytrain.20260504093543.247	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 09:36	Success	-	View
exp_self.20260504092942.990_20260504_092942 Paper: self.20260504092942.990	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504092942.990 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 09:30	Success	-	View
exp_self.20260504092201.989_20260504_092201 Paper: self.20260504092201.989	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504092201.989 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 09:23	Success	-	View
exp_self.20260504091424.988_20260504_091424 Paper: self.20260504091424.988	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504091424.988 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 09:15	Success	-	View
exp_self.20260504090648.987_20260504_090648 Paper: self.20260504090648.987	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504090648.987 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 09:07	Success	-	View
exp_pytrain.20260504090407.246_20260504_090408 Paper: pytrain.20260504090407.246	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 09:05	Success	-	View
exp_self.20260504085701.986_20260504_085702 Paper: self.20260504085701.986	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504085701.986 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 08:58	Success	-	View
exp_self.20260504084917.985_20260504_084917 Paper: self.20260504084917.985	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504084917.985 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 08:50	Success	-	View
exp_self.20260504084140.984_20260504_084140 Paper: self.20260504084140.984	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504084140.984 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 08:42	Success	-	View
exp_self.20260504083425.983_20260504_083425 Paper: self.20260504083425.983	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504083425.983 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 08:35	Success	-	View
exp_pytrain.20260504083152.245_20260504_083153 Paper: pytrain.20260504083152.245	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 08:32	Success	-	View
exp_self.20260504082739.982_20260504_082740 Paper: self.20260504082739.982	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504082739.982 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 08:28	Success	-	View
exp_self.20260504081759.981_20260504_081800 Paper: self.20260504081759.981	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504081759.981 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 08:19	Success	-	View
exp_self.20260504081016.980_20260504_081017 Paper: self.20260504081016.980	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504081016.980 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 08:11	Success	-	View
exp_self.20260504080233.979_20260504_080234 Paper: self.20260504080233.979	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504080233.979 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 08:03	Success	-	View
exp_pytrain.20260504080001.244_20260504_080001 Paper: pytrain.20260504080001.244	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 08:01	Success	-	View
exp_self.20260504075251.978_20260504_075251 Paper: self.20260504075251.978	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504075251.978 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 07:53	Success	-	View
exp_self.20260504074517.977_20260504_074517 Paper: self.20260504074517.977	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504074517.977 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 07:46	Success	-	View
exp_self.20260504073739.976_20260504_073740 Paper: self.20260504073739.976	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504073739.976 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 07:38	Success	-	View
exp_self.20260504072955.975_20260504_072955 Paper: self.20260504072955.975	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504072955.975 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 07:30	Success	-	View
exp_pytrain.20260504072720.243_20260504_072720 Paper: pytrain.20260504072720.243	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 07:28	Success	-	View
exp_self.20260504072114.974_20260504_072115 Paper: self.20260504072114.974	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504072114.974 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 07:22	Success	-	View
exp_self.20260504071331.973_20260504_071332 Paper: self.20260504071331.973	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504071331.973 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 07:14	Success	-	View
exp_self.20260504070554.972_20260504_070554 Paper: self.20260504070554.972	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504070554.972 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 07:06	Success	-	View
exp_self.20260504065818.971_20260504_065818 Paper: self.20260504065818.971	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504065818.971 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 06:59	Success	-	View
exp_pytrain.20260504065539.242_20260504_065539 Paper: pytrain.20260504065539.242	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 06:56	Success	-	View
exp_self.20260504064833.970_20260504_064833 Paper: self.20260504064833.970	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504064833.970 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 06:49	Success	-	View
exp_self.20260504064052.969_20260504_064052 Paper: self.20260504064052.969	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504064052.969 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 06:41	Success	-	View
exp_self.20260504063312.968_20260504_063312 Paper: self.20260504063312.968	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504063312.968 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 06:34	Success	-	View
exp_self.20260504062531.967_20260504_062531 Paper: self.20260504062531.967	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504062531.967 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 06:26	Success	-	View
exp_pytrain.20260504062256.241_20260504_062256 Paper: pytrain.20260504062256.241	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 06:24	Success	-	View
exp_self.20260504061655.966_20260504_061655 Paper: self.20260504061655.966	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504061655.966 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 06:17	Success	-	View
exp_self.20260504060918.965_20260504_060918 Paper: self.20260504060918.965	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504060918.965 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 06:10	Success	-	View
exp_self.20260504060142.964_20260504_060142 Paper: self.20260504060142.964	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504060142.964 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 06:02	Success	-	View
exp_self.20260504055400.963_20260504_055400 Paper: self.20260504055400.963	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504055400.963 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 05:55	Success	-	View
exp_pytrain.20260504055119.240_20260504_055120 Paper: pytrain.20260504055119.240	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 05:52	Success	-	View
exp_self.20260504054419.962_20260504_054420 Paper: self.20260504054419.962	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504054419.962 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 05:45	Success	-	View
exp_self.20260504053643.961_20260504_053643 Paper: self.20260504053643.961	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504053643.961 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 05:37	Success	-	View
exp_self.20260504052911.960_20260504_052911 Paper: self.20260504052911.960	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504052911.960 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 05:30	Success	-	View
exp_self.20260504052135.959_20260504_052135 Paper: self.20260504052135.959	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504052135.959 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 05:22	Success	-	View
exp_pytrain.20260504051855.239_20260504_051855 Paper: pytrain.20260504051855.239	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 05:19	Success	-	View
exp_self.20260504051146.958_20260504_051147 Paper: self.20260504051146.958	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504051146.958 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 05:12	Success	-	View
exp_self.20260504050403.957_20260504_050403 Paper: self.20260504050403.957	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504050403.957 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 05:05	Success	-	View
exp_self.20260504045628.956_20260504_045628 Paper: self.20260504045628.956	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504045628.956 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 04:57	Success	-	View
exp_self.20260504044852.955_20260504_044852 Paper: self.20260504044852.955	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504044852.955 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 04:49	Success	-	View
exp_pytrain.20260504044612.238_20260504_044612 Paper: pytrain.20260504044612.238	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 04:47	Success	-	View
exp_self.20260504043913.954_20260504_043914 Paper: self.20260504043913.954	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504043913.954 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 04:40	Success	-	View
exp_self.20260504043127.953_20260504_043128 Paper: self.20260504043127.953	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504043127.953 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 04:32	Success	-	View
exp_self.20260504042346.952_20260504_042347 Paper: self.20260504042346.952	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504042346.952 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 04:24	Success	-	View
exp_self.20260504041612.951_20260504_041612 Paper: self.20260504041612.951	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504041612.951 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 04:17	Success	-	View
exp_pytrain.20260504041335.237_20260504_041335 Paper: pytrain.20260504041335.237	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 04:14	Success	-	View
exp_self.20260504040628.950_20260504_040628 Paper: self.20260504040628.950	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504040628.950 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 04:07	Success	-	View
exp_self.20260504035842.949_20260504_035842 Paper: self.20260504035842.949	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504035842.949 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 03:59	Success	-	View
exp_self.20260504035057.948_20260504_035057 Paper: self.20260504035057.948	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504035057.948 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 03:52	Success	-	View
exp_self.20260504034322.947_20260504_034323 Paper: self.20260504034322.947	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504034322.947 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 03:44	Success	-	View
exp_pytrain.20260504034050.236_20260504_034050 Paper: pytrain.20260504034050.236	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 03:41	Success	-	View
exp_self.20260504033452.946_20260504_033452 Paper: self.20260504033452.946	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504033452.946 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 03:35	Success	-	View
exp_self.20260504032713.945_20260504_032713 Paper: self.20260504032713.945	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504032713.945 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 03:28	Success	-	View
exp_self.20260504031941.944_20260504_031941 Paper: self.20260504031941.944	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504031941.944 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 03:20	Success	-	View
exp_self.20260504031203.943_20260504_031204 Paper: self.20260504031203.943	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504031203.943 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 03:13	Success	-	View
exp_pytrain.20260504030924.235_20260504_030924 Paper: pytrain.20260504030924.235	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 03:10	Success	-	View
exp_self.20260504030400.942_20260504_030401 Paper: self.20260504030400.942	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504030400.942 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 03:05	Success	-	View
exp_hf_2604.23586_20260504_030038 Paper: hf_2604.23586	Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling Paper ID: hf_2604.23586 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-04 03:01	Success	-	View
exp_self.20260504025506.941_20260504_025507 Paper: self.20260504025506.941	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504025506.941 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 02:56	Success	-	View
exp_self.20260504024728.940_20260504_024728 Paper: self.20260504024728.940	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504024728.940 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 02:48	Success	-	View
exp_self.20260504023954.939_20260504_023955 Paper: self.20260504023954.939	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504023954.939 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 02:40	Success	-	View
exp_pytrain.20260504023725.234_20260504_023725 Paper: pytrain.20260504023725.234	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 02:38	Success	-	View
exp_self.20260504023152.938_20260504_023153 Paper: self.20260504023152.938	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504023152.938 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 02:32	Success	-	View
exp_self.20260504022414.937_20260504_022414 Paper: self.20260504022414.937	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504022414.937 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 02:25	Success	-	View
exp_self.20260504021643.936_20260504_021643 Paper: self.20260504021643.936	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504021643.936 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 02:17	Success	-	View
exp_self.20260504020908.935_20260504_020908 Paper: self.20260504020908.935	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504020908.935 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 02:10	Success	-	View
exp_pytrain.20260504020600.233_20260504_020600 Paper: pytrain.20260504020600.233	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 02:07	Success	-	View
exp_self.20260504015856.934_20260504_015856 Paper: self.20260504015856.934	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504015856.934 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 01:59	Success	-	View
exp_self.20260504015113.933_20260504_015113 Paper: self.20260504015113.933	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504015113.933 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 01:52	Success	-	View
exp_hf_2605.00323_20260504_014639 Paper: hf_2605.00323	Online Self-Calibration Against Hallucination in Vision-Language Models Paper ID: hf_2605.00323 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-04 01:47	Success	-	View
exp_self.20260504014431.932_20260504_014431 Paper: self.20260504014431.932	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504014431.932 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 01:45	Success	-	View
exp_self.20260504013700.931_20260504_013701 Paper: self.20260504013700.931	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504013700.931 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 01:38	Success	-	View
exp_pytrain.20260504013421.232_20260504_013421 Paper: pytrain.20260504013421.232	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 01:35	Success	-	View
exp_self.20260504012724.930_20260504_012725 Paper: self.20260504012724.930	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504012724.930 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 01:28	Success	-	View
exp_self.20260504011951.929_20260504_011952 Paper: self.20260504011951.929	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504011951.929 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 01:20	Success	-	View
exp_self.20260504011218.928_20260504_011218 Paper: self.20260504011218.928	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504011218.928 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 01:13	Success	-	View
exp_self.20260504010447.927_20260504_010447 Paper: self.20260504010447.927	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504010447.927 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 01:05	Success	-	View
exp_pytrain.20260504010208.231_20260504_010209 Paper: pytrain.20260504010208.231	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 01:03	Success	-	View
exp_hf_2605.00691_20260504_005919 Paper: hf_2605.00691	Learning to Act and Cooperate for Distributed Black-Box Consensus Optimization Paper ID: hf_2605.00691 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-04 01:00	Success	-	View
exp_self.20260504005454.926_20260504_005455 Paper: self.20260504005454.926	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504005454.926 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 00:55	Success	-	View
exp_self.20260504004722.925_20260504_004723 Paper: self.20260504004722.925	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504004722.925 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 00:48	Success	-	View
exp_self.20260504003950.924_20260504_003950 Paper: self.20260504003950.924	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504003950.924 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 00:40	Success	-	View
exp_self.20260504003211.923_20260504_003211 Paper: self.20260504003211.923	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504003211.923 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 00:33	Success	-	View
exp_pytrain.20260504002933.230_20260504_002934 Paper: pytrain.20260504002933.230	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-04 00:30	Success	-	View
exp_self.20260504002229.922_20260504_002229 Paper: self.20260504002229.922	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504002229.922 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 00:23	Success	-	View
exp_self.20260504001451.921_20260504_001452 Paper: self.20260504001451.921	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504001451.921 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 00:15	Success	-	View
exp_self.20260504000713.920_20260504_000713 Paper: self.20260504000713.920	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260504000713.920 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 00:08	Success	-	View
exp_self.20260503235926.919_20260503_235926 Paper: self.20260503235926.919	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503235926.919 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-04 00:00	Success	-	View
exp_pytrain.20260503235654.229_20260503_235654 Paper: pytrain.20260503235654.229	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 23:57	Success	-	View
exp_self.20260503234951.918_20260503_234951 Paper: self.20260503234951.918	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503234951.918 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 23:50	Success	-	View
exp_self.20260503234216.917_20260503_234216 Paper: self.20260503234216.917	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503234216.917 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 23:43	Success	-	View
exp_self.20260503233444.916_20260503_233444 Paper: self.20260503233444.916	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503233444.916 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 23:35	Success	-	View
exp_self.20260503232710.915_20260503_232710 Paper: self.20260503232710.915	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503232710.915 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 23:28	Success	-	View
exp_pytrain.20260503232432.228_20260503_232432 Paper: pytrain.20260503232432.228	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 23:25	Success	-	View
exp_self.20260503232014.914_20260503_232014 Paper: self.20260503232014.914	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503232014.914 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 23:21	Success	-	View
exp_self.20260503231240.913_20260503_231241 Paper: self.20260503231240.913	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503231240.913 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 23:13	Success	-	View
exp_self.20260503230508.912_20260503_230508 Paper: self.20260503230508.912	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503230508.912 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 23:06	Success	-	View
exp_hf_2604.23195_20260503_230212 Paper: hf_2604.23195	AnalogRetriever: Learning Cross-Modal Representations for Analog Circuit Retrieval Paper ID: hf_2604.23195 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-03 23:03	Success	-	View
exp_self.20260503225503.911_20260503_225503 Paper: self.20260503225503.911	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503225503.911 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 22:56	Success	-	View
exp_pytrain.20260503225231.227_20260503_225232 Paper: pytrain.20260503225231.227	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 22:53	Success	-	View
exp_self.20260503224524.910_20260503_224525 Paper: self.20260503224524.910	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503224524.910 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 22:46	Success	-	View
exp_self.20260503223749.909_20260503_223749 Paper: self.20260503223749.909	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503223749.909 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 22:38	Success	-	View
exp_self.20260503223015.908_20260503_223016 Paper: self.20260503223015.908	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503223015.908 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 22:31	Success	-	View
exp_self.20260503222243.907_20260503_222243 Paper: self.20260503222243.907	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503222243.907 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 22:23	Success	-	View
exp_pytrain.20260503222009.226_20260503_222009 Paper: pytrain.20260503222009.226	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 22:21	Success	-	View
exp_self.20260503221304.906_20260503_221305 Paper: self.20260503221304.906	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503221304.906 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 22:14	Success	-	View
exp_self.20260503220535.905_20260503_220535 Paper: self.20260503220535.905	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503220535.905 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 22:06	Success	-	View
exp_self.20260503215803.904_20260503_215803 Paper: self.20260503215803.904	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503215803.904 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 21:59	Success	-	View
exp_self.20260503215023.903_20260503_215024 Paper: self.20260503215023.903	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503215023.903 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 21:51	Success	-	View
exp_pytrain.20260503214751.225_20260503_214751 Paper: pytrain.20260503214751.225	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 21:48	Success	-	View
exp_self.20260503214045.902_20260503_214045 Paper: self.20260503214045.902	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503214045.902 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 21:41	Success	-	View
exp_self.20260503213312.901_20260503_213313 Paper: self.20260503213312.901	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503213312.901 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 21:34	Success	-	View
exp_self.20260503212540.900_20260503_212540 Paper: self.20260503212540.900	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503212540.900 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 21:26	Success	-	View
exp_self.20260503211803.899_20260503_211803 Paper: self.20260503211803.899	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503211803.899 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 21:19	Success	-	View
exp_pytrain.20260503211529.224_20260503_211530 Paper: pytrain.20260503211529.224	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 21:16	Success	-	View
exp_self.20260503211006.898_20260503_211006 Paper: self.20260503211006.898	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503211006.898 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 21:11	Success	-	View
exp_self.20260503210229.897_20260503_210230 Paper: self.20260503210229.897	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503210229.897 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 21:03	Success	-	View
exp_2605.00814v1_20260503_205911 Paper: 2605.00814v1	Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs Paper ID: 2605.00814v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	05-03 21:00	Success	-	View
exp_self.20260503205449.896_20260503_205449 Paper: self.20260503205449.896	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503205449.896 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 20:55	Success	-	View
exp_hf_2605.00658_20260503_205126 Paper: hf_2605.00658	UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors Paper ID: hf_2605.00658 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-03 20:52	Success	-	View
exp_self.20260503204553.895_20260503_204554 Paper: self.20260503204553.895	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503204553.895 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 20:46	Success	-	View
exp_pytrain.20260503204320.223_20260503_204320 Paper: pytrain.20260503204320.223	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 20:44	Success	-	View
exp_self.20260503203616.894_20260503_203616 Paper: self.20260503203616.894	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503203616.894 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 20:37	Success	-	View
exp_self.20260503202843.893_20260503_202844 Paper: self.20260503202843.893	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503202843.893 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 20:29	Success	-	View
exp_self.20260503202113.892_20260503_202113 Paper: self.20260503202113.892	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503202113.892 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 20:22	Success	-	View
exp_self.20260503201341.891_20260503_201341 Paper: self.20260503201341.891	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503201341.891 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 20:14	Success	-	View
exp_pytrain.20260503201103.222_20260503_201104 Paper: pytrain.20260503201103.222	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 20:12	Success	-	View
exp_self.20260503200406.890_20260503_200407 Paper: self.20260503200406.890	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503200406.890 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 20:05	Success	-	View
exp_self.20260503195634.889_20260503_195634 Paper: self.20260503195634.889	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503195634.889 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 19:57	Success	-	View
exp_self.20260503194904.888_20260503_194904 Paper: self.20260503194904.888	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503194904.888 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 19:50	Success	-	View
exp_self.20260503194129.887_20260503_194130 Paper: self.20260503194129.887	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503194129.887 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 19:42	Success	-	View
exp_pytrain.20260503193852.221_20260503_193853 Paper: pytrain.20260503193852.221	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 19:39	Success	-	View
exp_self.20260503193156.886_20260503_193157 Paper: self.20260503193156.886	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503193156.886 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 19:32	Success	-	View
exp_self.20260503192418.885_20260503_192419 Paper: self.20260503192418.885	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503192418.885 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 19:25	Success	-	View
exp_self.20260503191649.884_20260503_191649 Paper: self.20260503191649.884	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503191649.884 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 19:17	Success	-	View
exp_self.20260503190920.883_20260503_190920 Paper: self.20260503190920.883	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503190920.883 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 19:10	Success	-	View
exp_pytrain.20260503190644.220_20260503_190644 Paper: pytrain.20260503190644.220	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 19:07	Success	-	View
exp_self.20260503185946.882_20260503_185946 Paper: self.20260503185946.882	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503185946.882 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 19:00	Success	-	View
exp_self.20260503185212.881_20260503_185212 Paper: self.20260503185212.881	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503185212.881 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 18:53	Success	-	View
exp_self.20260503184437.880_20260503_184438 Paper: self.20260503184437.880	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503184437.880 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 18:45	Success	-	View
exp_self.20260503183708.879_20260503_183709 Paper: self.20260503183708.879	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503183708.879 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 18:38	Success	-	View
exp_pytrain.20260503183433.219_20260503_183434 Paper: pytrain.20260503183433.219	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 18:35	Success	-	View
exp_self.20260503182907.878_20260503_182907 Paper: self.20260503182907.878	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503182907.878 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 18:30	Success	-	View
exp_self.20260503182136.877_20260503_182136 Paper: self.20260503182136.877	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503182136.877 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 18:22	Success	-	View
exp_self.20260503181358.876_20260503_181358 Paper: self.20260503181358.876	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503181358.876 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 18:15	Success	-	View
exp_self.20260503180557.875_20260503_180557 Paper: self.20260503180557.875	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503180557.875 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 18:07	Success	-	View
exp_pytrain.20260503180301.218_20260503_180301 Paper: pytrain.20260503180301.218	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 18:04	Success	-	View
exp_self.20260503175722.874_20260503_175722 Paper: self.20260503175722.874	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503175722.874 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 17:58	Success	-	View
exp_self.20260503174921.873_20260503_174921 Paper: self.20260503174921.873	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503174921.873 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 17:50	Success	-	View
exp_self.20260503174135.872_20260503_174136 Paper: self.20260503174135.872	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503174135.872 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 17:42	Success	-	View
exp_self.20260503173346.871_20260503_173347 Paper: self.20260503173346.871	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503173346.871 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 17:34	Success	-	View
exp_pytrain.20260503173050.217_20260503_173051 Paper: pytrain.20260503173050.217	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 17:31	Success	-	View
exp_self.20260503172348.870_20260503_172348 Paper: self.20260503172348.870	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503172348.870 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 17:24	Success	-	View
exp_self.20260503171613.869_20260503_171613 Paper: self.20260503171613.869	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503171613.869 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 17:17	Success	-	View
exp_self.20260503170839.868_20260503_170840 Paper: self.20260503170839.868	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503170839.868 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 17:09	Success	-	View
exp_self.20260503170106.867_20260503_170107 Paper: self.20260503170106.867	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503170106.867 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 17:02	Success	-	View
exp_pytrain.20260503165833.216_20260503_165833 Paper: pytrain.20260503165833.216	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 16:59	Success	-	View
exp_self.20260503165136.866_20260503_165137 Paper: self.20260503165136.866	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503165136.866 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 16:52	Success	-	View
exp_self.20260503164359.865_20260503_164359 Paper: self.20260503164359.865	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503164359.865 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 16:45	Success	-	View
exp_self.20260503163609.864_20260503_163610 Paper: self.20260503163609.864	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503163609.864 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 16:37	Success	-	View
exp_self.20260503162838.863_20260503_162839 Paper: self.20260503162838.863	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503162838.863 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 16:29	Success	-	View
exp_pytrain.20260503162555.215_20260503_162556 Paper: pytrain.20260503162555.215	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 16:26	Success	-	View
exp_self.20260503161853.862_20260503_161853 Paper: self.20260503161853.862	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503161853.862 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 16:19	Success	-	View
exp_self.20260503161120.861_20260503_161121 Paper: self.20260503161120.861	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503161120.861 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 16:12	Success	-	View
exp_self.20260503160350.860_20260503_160350 Paper: self.20260503160350.860	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503160350.860 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 16:04	Success	-	View
exp_self.20260503155619.859_20260503_155620 Paper: self.20260503155619.859	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503155619.859 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 15:57	Success	-	View
exp_pytrain.20260503155338.214_20260503_155339 Paper: pytrain.20260503155338.214	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 15:54	Success	-	View
exp_self.20260503154644.858_20260503_154644 Paper: self.20260503154644.858	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503154644.858 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 15:47	Success	-	View
exp_self.20260503153908.857_20260503_153909 Paper: self.20260503153908.857	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503153908.857 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 15:40	Success	-	View
exp_self.20260503153130.856_20260503_153130 Paper: self.20260503153130.856	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503153130.856 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 15:32	Success	-	View
exp_self.20260503152359.855_20260503_152400 Paper: self.20260503152359.855	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503152359.855 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 15:25	Success	-	View
exp_pytrain.20260503152124.213_20260503_152125 Paper: pytrain.20260503152124.213	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 15:22	Success	-	View
exp_self.20260503151414.854_20260503_151414 Paper: self.20260503151414.854	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503151414.854 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 15:15	Success	-	View
exp_self.20260503150633.853_20260503_150633 Paper: self.20260503150633.853	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503150633.853 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 15:07	Success	-	View
exp_self.20260503145845.852_20260503_145845 Paper: self.20260503145845.852	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503145845.852 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 14:59	Success	-	View
exp_self.20260503145114.851_20260503_145114 Paper: self.20260503145114.851	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503145114.851 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 14:52	Success	-	View
exp_pytrain.20260503144843.212_20260503_144844 Paper: pytrain.20260503144843.212	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 14:49	Success	-	View
exp_self.20260503144145.850_20260503_144146 Paper: self.20260503144145.850	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503144145.850 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 14:42	Success	-	View
exp_self.20260503143410.849_20260503_143410 Paper: self.20260503143410.849	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503143410.849 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 14:35	Success	-	View
exp_self.20260503142634.848_20260503_142635 Paper: self.20260503142634.848	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503142634.848 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 14:27	Success	-	View
exp_self.20260503141857.847_20260503_141857 Paper: self.20260503141857.847	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503141857.847 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 14:20	Success	-	View
exp_pytrain.20260503141622.211_20260503_141622 Paper: pytrain.20260503141622.211	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 14:17	Success	-	View
exp_self.20260503140917.846_20260503_140917 Paper: self.20260503140917.846	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503140917.846 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 14:10	Success	-	View
exp_self.20260503140147.845_20260503_140147 Paper: self.20260503140147.845	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503140147.845 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 14:02	Success	-	View
exp_self.20260503135413.844_20260503_135414 Paper: self.20260503135413.844	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503135413.844 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 13:55	Success	-	View
exp_self.20260503134643.843_20260503_134643 Paper: self.20260503134643.843	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503134643.843 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 13:47	Success	-	View
exp_pytrain.20260503134412.210_20260503_134413 Paper: pytrain.20260503134412.210	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 13:45	Success	-	View
exp_self.20260503133709.842_20260503_133710 Paper: self.20260503133709.842	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503133709.842 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 13:38	Success	-	View
exp_self.20260503132940.841_20260503_132940 Paper: self.20260503132940.841	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503132940.841 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 13:30	Success	-	View
exp_self.20260503132208.840_20260503_132209 Paper: self.20260503132208.840	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503132208.840 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 13:23	Success	-	View
exp_self.20260503131436.839_20260503_131436 Paper: self.20260503131436.839	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503131436.839 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 13:15	Success	-	View
exp_pytrain.20260503131204.209_20260503_131204 Paper: pytrain.20260503131204.209	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 13:13	Success	-	View
exp_self.20260503130500.838_20260503_130500 Paper: self.20260503130500.838	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503130500.838 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 13:06	Success	-	View
exp_self.20260503125732.837_20260503_125732 Paper: self.20260503125732.837	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503125732.837 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 12:58	Success	-	View
exp_self.20260503125002.836_20260503_125003 Paper: self.20260503125002.836	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503125002.836 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 12:51	Success	-	View
exp_self.20260503124230.835_20260503_124230 Paper: self.20260503124230.835	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503124230.835 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 12:43	Success	-	View
exp_pytrain.20260503123957.208_20260503_123957 Paper: pytrain.20260503123957.208	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 12:40	Success	-	View
exp_self.20260503123300.834_20260503_123300 Paper: self.20260503123300.834	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503123300.834 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 12:34	Success	-	View
exp_self.20260503122529.833_20260503_122530 Paper: self.20260503122529.833	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503122529.833 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 12:26	Success	-	View
exp_self.20260503121757.832_20260503_121757 Paper: self.20260503121757.832	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503121757.832 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 12:19	Success	-	View
exp_self.20260503121026.831_20260503_121026 Paper: self.20260503121026.831	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503121026.831 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 12:11	Success	-	View
exp_pytrain.20260503120749.207_20260503_120749 Paper: pytrain.20260503120749.207	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 12:08	Success	-	View
exp_self.20260503120049.830_20260503_120049 Paper: self.20260503120049.830	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503120049.830 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 12:01	Success	-	View
exp_self.20260503115317.829_20260503_115317 Paper: self.20260503115317.829	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503115317.829 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 11:54	Success	-	View
exp_self.20260503114548.828_20260503_114549 Paper: self.20260503114548.828	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503114548.828 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 11:46	Success	-	View
exp_self.20260503113818.827_20260503_113818 Paper: self.20260503113818.827	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503113818.827 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 11:39	Success	-	View
exp_pytrain.20260503113537.206_20260503_113538 Paper: pytrain.20260503113537.206	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 11:36	Success	-	View
exp_self.20260503112841.826_20260503_112841 Paper: self.20260503112841.826	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503112841.826 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 11:29	Success	-	View
exp_self.20260503112111.825_20260503_112111 Paper: self.20260503112111.825	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503112111.825 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 11:22	Success	-	View
exp_self.20260503111333.824_20260503_111333 Paper: self.20260503111333.824	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503111333.824 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 11:14	Success	-	View
exp_self.20260503110600.823_20260503_110600 Paper: self.20260503110600.823	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503110600.823 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 11:07	Success	-	View
exp_pytrain.20260503110322.205_20260503_110322 Paper: pytrain.20260503110322.205	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 11:04	Success	-	View
exp_self.20260503105625.822_20260503_105626 Paper: self.20260503105625.822	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503105625.822 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 10:57	Success	-	View
exp_self.20260503104847.821_20260503_104847 Paper: self.20260503104847.821	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503104847.821 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 10:49	Success	-	View
exp_self.20260503104115.820_20260503_104115 Paper: self.20260503104115.820	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503104115.820 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 10:42	Success	-	View
exp_self.20260503103341.819_20260503_103341 Paper: self.20260503103341.819	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503103341.819 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 10:34	Success	-	View
exp_pytrain.20260503103106.204_20260503_103107 Paper: pytrain.20260503103106.204	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 10:32	Success	-	View
exp_self.20260503102407.818_20260503_102408 Paper: self.20260503102407.818	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503102407.818 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 10:25	Success	-	View
exp_self.20260503101627.817_20260503_101627 Paper: self.20260503101627.817	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503101627.817 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 10:17	Success	-	View
exp_self.20260503100853.816_20260503_100854 Paper: self.20260503100853.816	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503100853.816 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 10:09	Success	-	View
exp_self.20260503100123.815_20260503_100123 Paper: self.20260503100123.815	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503100123.815 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 10:02	Success	-	View
exp_pytrain.20260503095852.203_20260503_095852 Paper: pytrain.20260503095852.203	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 09:59	Success	-	View
exp_self.20260503095253.814_20260503_095253 Paper: self.20260503095253.814	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503095253.814 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 09:53	Success	-	View
exp_self.20260503094519.813_20260503_094519 Paper: self.20260503094519.813	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503094519.813 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 09:46	Success	-	View
exp_self.20260503093750.812_20260503_093750 Paper: self.20260503093750.812	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503093750.812 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 09:38	Success	-	View
exp_self.20260503093010.811_20260503_093011 Paper: self.20260503093010.811	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503093010.811 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 09:31	Success	-	View
exp_pytrain.20260503092729.202_20260503_092730 Paper: pytrain.20260503092729.202	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 09:28	Success	-	View
exp_self.20260503092034.810_20260503_092034 Paper: self.20260503092034.810	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503092034.810 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 09:21	Success	-	View
exp_self.20260503091246.809_20260503_091246 Paper: self.20260503091246.809	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503091246.809 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 09:13	Success	-	View
exp_self.20260503090509.808_20260503_090509 Paper: self.20260503090509.808	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503090509.808 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 09:06	Success	-	View
exp_self.20260503085736.807_20260503_085736 Paper: self.20260503085736.807	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503085736.807 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 08:58	Success	-	View
exp_pytrain.20260503085454.201_20260503_085454 Paper: pytrain.20260503085454.201	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 08:55	Success	-	View
exp_self.20260503084759.806_20260503_084759 Paper: self.20260503084759.806	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503084759.806 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 08:49	Success	-	View
exp_self.20260503084020.805_20260503_084021 Paper: self.20260503084020.805	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503084020.805 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 08:41	Success	-	View
exp_self.20260503083250.804_20260503_083251 Paper: self.20260503083250.804	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503083250.804 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 08:33	Success	-	View
exp_self.20260503082521.803_20260503_082522 Paper: self.20260503082521.803	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503082521.803 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 08:26	Success	-	View
exp_pytrain.20260503082247.200_20260503_082247 Paper: pytrain.20260503082247.200	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 08:23	Success	-	View
exp_self.20260503081553.802_20260503_081553 Paper: self.20260503081553.802	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503081553.802 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 08:16	Success	-	View
exp_self.20260503080816.801_20260503_080816 Paper: self.20260503080816.801	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503080816.801 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 08:09	Success	-	View
exp_self.20260503080042.800_20260503_080043 Paper: self.20260503080042.800	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503080042.800 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 08:01	Success	-	View
exp_self.20260503075313.799_20260503_075313 Paper: self.20260503075313.799	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503075313.799 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 07:54	Success	-	View
exp_pytrain.20260503075039.199_20260503_075039 Paper: pytrain.20260503075039.199	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 07:51	Success	-	View
exp_self.20260503074346.798_20260503_074347 Paper: self.20260503074346.798	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503074346.798 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 07:44	Success	-	View
exp_gh_tamimmirza_hallueval_20260503_073921 Paper: gh_tamimmirza_hallueval	tamimmirza/hallueval Paper ID: gh_tamimmirza_hallueval - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 07:40	Success	-	View
exp_self.20260503073607.797_20260503_073607 Paper: self.20260503073607.797	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503073607.797 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 07:37	Success	-	View
exp_self.20260503072832.796_20260503_072833 Paper: self.20260503072832.796	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503072832.796 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 07:29	Success	-	View
exp_self.20260503072055.795_20260503_072055 Paper: self.20260503072055.795	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503072055.795 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 07:21	Success	-	View
exp_pytrain.20260503071821.198_20260503_071822 Paper: pytrain.20260503071821.198	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 07:19	Success	-	View
exp_self.20260503071119.794_20260503_071119 Paper: self.20260503071119.794	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503071119.794 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 07:12	Success	-	View
exp_self.20260503070351.793_20260503_070352 Paper: self.20260503070351.793	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503070351.793 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 07:04	Success	-	View
exp_self.20260503065623.792_20260503_065624 Paper: self.20260503065623.792	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503065623.792 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 06:57	Success	-	View
exp_self.20260503064841.791_20260503_064841 Paper: self.20260503064841.791	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503064841.791 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 06:49	Success	-	View
exp_pytrain.20260503064606.197_20260503_064606 Paper: pytrain.20260503064606.197	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 06:47	Success	-	View
exp_self.20260503063902.790_20260503_063902 Paper: self.20260503063902.790	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503063902.790 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 06:40	Success	-	View
exp_self.20260503063127.789_20260503_063128 Paper: self.20260503063127.789	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503063127.789 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 06:32	Success	-	View
exp_self.20260503062355.788_20260503_062355 Paper: self.20260503062355.788	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503062355.788 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 06:24	Success	-	View
exp_self.20260503061625.787_20260503_061626 Paper: self.20260503061625.787	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503061625.787 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 06:17	Success	-	View
exp_pytrain.20260503061339.196_20260503_061339 Paper: pytrain.20260503061339.196	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 06:14	Success	-	View
exp_self.20260503060648.786_20260503_060648 Paper: self.20260503060648.786	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503060648.786 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 06:07	Success	-	View
exp_self.20260503055921.785_20260503_055922 Paper: self.20260503055921.785	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503055921.785 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 06:00	Success	-	View
exp_self.20260503055156.784_20260503_055156 Paper: self.20260503055156.784	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503055156.784 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 05:52	Success	-	View
exp_self.20260503054428.783_20260503_054429 Paper: self.20260503054428.783	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503054428.783 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 05:45	Success	-	View
exp_pytrain.20260503054158.195_20260503_054158 Paper: pytrain.20260503054158.195	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 05:43	Success	-	View
exp_self.20260503053509.782_20260503_053509 Paper: self.20260503053509.782	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503053509.782 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 05:36	Success	-	View
exp_self.20260503052738.781_20260503_052738 Paper: self.20260503052738.781	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503052738.781 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 05:28	Success	-	View
exp_self.20260503052009.780_20260503_052010 Paper: self.20260503052009.780	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503052009.780 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 05:21	Success	-	View
exp_self.20260503051242.779_20260503_051242 Paper: self.20260503051242.779	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503051242.779 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 05:13	Success	-	View
exp_pytrain.20260503051011.194_20260503_051012 Paper: pytrain.20260503051011.194	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 05:11	Success	-	View
exp_self.20260503050323.778_20260503_050323 Paper: self.20260503050323.778	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503050323.778 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 05:04	Success	-	View
exp_self.20260503045550.777_20260503_045550 Paper: self.20260503045550.777	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503045550.777 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 04:56	Success	-	View
exp_self.20260503044822.776_20260503_044823 Paper: self.20260503044822.776	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503044822.776 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 04:49	Success	-	View
exp_self.20260503044056.775_20260503_044056 Paper: self.20260503044056.775	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503044056.775 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 04:41	Success	-	View
exp_pytrain.20260503043829.193_20260503_043830 Paper: pytrain.20260503043829.193	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 04:39	Success	-	View
exp_self.20260503043136.774_20260503_043137 Paper: self.20260503043136.774	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503043136.774 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 04:32	Success	-	View
exp_self.20260503042411.773_20260503_042412 Paper: self.20260503042411.773	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503042411.773 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 04:25	Success	-	View
exp_self.20260503041641.772_20260503_041642 Paper: self.20260503041641.772	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503041641.772 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 04:17	Success	-	View
exp_self.20260503040915.771_20260503_040915 Paper: self.20260503040915.771	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503040915.771 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 04:10	Success	-	View
exp_pytrain.20260503040650.192_20260503_040651 Paper: pytrain.20260503040650.192	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 04:07	Success	-	View
exp_self.20260503035949.770_20260503_035949 Paper: self.20260503035949.770	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503035949.770 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 04:00	Success	-	View
exp_self.20260503035220.769_20260503_035221 Paper: self.20260503035220.769	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503035220.769 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 03:53	Success	-	View
exp_self.20260503034448.768_20260503_034448 Paper: self.20260503034448.768	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503034448.768 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 03:45	Success	-	View
exp_self.20260503033717.767_20260503_033717 Paper: self.20260503033717.767	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503033717.767 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 03:38	Success	-	View
exp_pytrain.20260503033451.191_20260503_033451 Paper: pytrain.20260503033451.191	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 03:35	Success	-	View
exp_self.20260503032752.766_20260503_032752 Paper: self.20260503032752.766	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503032752.766 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 03:28	Success	-	View
exp_self.20260503032016.765_20260503_032016 Paper: self.20260503032016.765	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503032016.765 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 03:21	Success	-	View
exp_self.20260503031249.764_20260503_031249 Paper: self.20260503031249.764	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503031249.764 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 03:13	Success	-	View
exp_self.20260503030514.763_20260503_030514 Paper: self.20260503030514.763	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503030514.763 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 03:06	Success	-	View
exp_pytrain.20260503030247.190_20260503_030247 Paper: pytrain.20260503030247.190	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 03:03	Success	-	View
exp_self.20260503025546.762_20260503_025546 Paper: self.20260503025546.762	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503025546.762 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 02:56	Success	-	View
exp_self.20260503024822.761_20260503_024823 Paper: self.20260503024822.761	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503024822.761 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 02:49	Success	-	View
exp_self.20260503024057.760_20260503_024057 Paper: self.20260503024057.760	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503024057.760 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 02:42	Success	-	View
exp_self.20260503023326.759_20260503_023326 Paper: self.20260503023326.759	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503023326.759 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 02:34	Success	-	View
exp_pytrain.20260503023058.189_20260503_023058 Paper: pytrain.20260503023058.189	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 02:32	Success	-	View
exp_self.20260503022400.758_20260503_022400 Paper: self.20260503022400.758	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503022400.758 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 02:25	Success	-	View
exp_self.20260503021635.757_20260503_021636 Paper: self.20260503021635.757	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503021635.757 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 02:17	Success	-	View
exp_self.20260503020907.756_20260503_020908 Paper: self.20260503020907.756	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503020907.756 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 02:10	Success	-	View
exp_self.20260503020140.755_20260503_020141 Paper: self.20260503020140.755	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503020140.755 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 02:02	Success	-	View
exp_pytrain.20260503015909.188_20260503_015909 Paper: pytrain.20260503015909.188	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 02:00	Success	-	View
exp_self.20260503015210.754_20260503_015210 Paper: self.20260503015210.754	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503015210.754 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 01:53	Success	-	View
exp_self.20260503014442.753_20260503_014442 Paper: self.20260503014442.753	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503014442.753 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 01:45	Success	-	View
exp_self.20260503013720.752_20260503_013720 Paper: self.20260503013720.752	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503013720.752 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 01:38	Success	-	View
exp_self.20260503012953.751_20260503_012953 Paper: self.20260503012953.751	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503012953.751 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 01:30	Success	-	View
exp_pytrain.20260503012718.187_20260503_012718 Paper: pytrain.20260503012718.187	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 01:28	Success	-	View
exp_self.20260503012026.750_20260503_012026 Paper: self.20260503012026.750	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503012026.750 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 01:21	Success	-	View
exp_self.20260503011300.749_20260503_011300 Paper: self.20260503011300.749	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503011300.749 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 01:14	Success	-	View
exp_self.20260503010531.748_20260503_010531 Paper: self.20260503010531.748	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503010531.748 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 01:06	Success	-	View
exp_self.20260503005757.747_20260503_005758 Paper: self.20260503005757.747	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503005757.747 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 00:59	Success	-	View
exp_pytrain.20260503005526.186_20260503_005526 Paper: pytrain.20260503005526.186	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 00:56	Success	-	View
exp_self.20260503005005.746_20260503_005005 Paper: self.20260503005005.746	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503005005.746 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 00:51	Success	-	View
exp_gh_divyamhi_longbench-diagnostics_20260503_004652 Paper: gh_divyamhi_longbench-diagnostics	divyamhi/longbench-diagnostics Paper ID: gh_divyamhi_longbench-diagnostics - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal:...	05-03 00:47	Success	-	View
exp_self.20260503004127.745_20260503_004127 Paper: self.20260503004127.745	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503004127.745 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 00:42	Success	-	View
exp_self.20260503003402.744_20260503_003403 Paper: self.20260503003402.744	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503003402.744 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 00:35	Success	-	View
exp_self.20260503002631.743_20260503_002632 Paper: self.20260503002631.743	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503002631.743 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 00:27	Success	-	View
exp_pytrain.20260503002405.185_20260503_002406 Paper: pytrain.20260503002405.185	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-03 00:25	Success	-	View
exp_self.20260503001706.742_20260503_001707 Paper: self.20260503001706.742	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503001706.742 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 00:18	Success	-	View
exp_self.20260503000932.741_20260503_000933 Paper: self.20260503000932.741	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503000932.741 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 00:10	Success	-	View
exp_self.20260503000118.740_20260503_000119 Paper: self.20260503000118.740	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260503000118.740 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-03 00:02	Success	-	View
exp_self.20260502235356.739_20260502_235357 Paper: self.20260502235356.739	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502235356.739 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 23:54	Success	-	View
exp_pytrain.20260502235126.184_20260502_235126 Paper: pytrain.20260502235126.184	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 23:52	Success	-	View
exp_self.20260502234437.738_20260502_234438 Paper: self.20260502234437.738	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502234437.738 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 23:45	Success	-	View
exp_self.20260502233709.737_20260502_233709 Paper: self.20260502233709.737	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502233709.737 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 23:38	Success	-	View
exp_self.20260502232940.736_20260502_232941 Paper: self.20260502232940.736	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502232940.736 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 23:30	Success	-	View
exp_self.20260502232213.735_20260502_232214 Paper: self.20260502232213.735	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502232213.735 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 23:23	Success	-	View
exp_pytrain.20260502231947.183_20260502_231947 Paper: pytrain.20260502231947.183	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 23:20	Success	-	View
exp_self.20260502231247.734_20260502_231247 Paper: self.20260502231247.734	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502231247.734 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 23:13	Success	-	View
exp_self.20260502230522.733_20260502_230522 Paper: self.20260502230522.733	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502230522.733 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 23:06	Success	-	View
exp_self.20260502225752.732_20260502_225752 Paper: self.20260502225752.732	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502225752.732 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 22:58	Success	-	View
exp_self.20260502225016.731_20260502_225016 Paper: self.20260502225016.731	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502225016.731 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 22:51	Success	-	View
exp_pytrain.20260502224751.182_20260502_224752 Paper: pytrain.20260502224751.182	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 22:48	Success	-	View
exp_self.20260502224054.730_20260502_224055 Paper: self.20260502224054.730	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502224054.730 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 22:41	Success	-	View
exp_self.20260502223330.729_20260502_223331 Paper: self.20260502223330.729	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502223330.729 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 22:34	Success	-	View
exp_self.20260502222602.728_20260502_222602 Paper: self.20260502222602.728	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502222602.728 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 22:27	Success	-	View
exp_self.20260502221834.727_20260502_221834 Paper: self.20260502221834.727	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502221834.727 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 22:19	Success	-	View
exp_pytrain.20260502221608.181_20260502_221608 Paper: pytrain.20260502221608.181	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 22:17	Success	-	View
exp_self.20260502220916.726_20260502_220917 Paper: self.20260502220916.726	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502220916.726 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 22:10	Success	-	View
exp_self.20260502220153.725_20260502_220154 Paper: self.20260502220153.725	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502220153.725 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 22:02	Success	-	View
exp_self.20260502215429.724_20260502_215429 Paper: self.20260502215429.724	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502215429.724 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 21:55	Success	-	View
exp_self.20260502214702.723_20260502_214703 Paper: self.20260502214702.723	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502214702.723 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 21:48	Success	-	View
exp_pytrain.20260502214436.180_20260502_214436 Paper: pytrain.20260502214436.180	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 21:45	Success	-	View
exp_self.20260502213738.722_20260502_213738 Paper: self.20260502213738.722	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502213738.722 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 21:38	Success	-	View
exp_self.20260502213015.721_20260502_213015 Paper: self.20260502213015.721	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502213015.721 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 21:31	Success	-	View
exp_self.20260502212250.720_20260502_212250 Paper: self.20260502212250.720	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502212250.720 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 21:23	Success	-	View
exp_self.20260502211520.719_20260502_211521 Paper: self.20260502211520.719	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502211520.719 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 21:16	Success	-	View
exp_pytrain.20260502211253.179_20260502_211253 Paper: pytrain.20260502211253.179	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 21:13	Success	-	View
exp_self.20260502210555.718_20260502_210555 Paper: self.20260502210555.718	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502210555.718 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 21:06	Success	-	View
exp_self.20260502205826.717_20260502_205826 Paper: self.20260502205826.717	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502205826.717 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 20:59	Success	-	View
exp_self.20260502205103.716_20260502_205103 Paper: self.20260502205103.716	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502205103.716 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 20:52	Success	-	View
exp_self.20260502204338.715_20260502_204338 Paper: self.20260502204338.715	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502204338.715 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 20:44	Success	-	View
exp_pytrain.20260502204106.178_20260502_204106 Paper: pytrain.20260502204106.178	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 20:42	Success	-	View
exp_self.20260502203413.714_20260502_203413 Paper: self.20260502203413.714	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502203413.714 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 20:35	Success	-	View
exp_self.20260502202646.713_20260502_202647 Paper: self.20260502202646.713	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502202646.713 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 20:27	Success	-	View
exp_self.20260502201924.712_20260502_201925 Paper: self.20260502201924.712	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502201924.712 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 20:20	Success	-	View
exp_self.20260502201201.711_20260502_201201 Paper: self.20260502201201.711	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502201201.711 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 20:13	Success	-	View
exp_pytrain.20260502200929.177_20260502_200929 Paper: pytrain.20260502200929.177	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 20:10	Success	-	View
exp_self.20260502200239.710_20260502_200239 Paper: self.20260502200239.710	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502200239.710 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 20:03	Success	-	View
exp_self.20260502195512.709_20260502_195513 Paper: self.20260502195512.709	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502195512.709 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 19:56	Success	-	View
exp_self.20260502194746.708_20260502_194747 Paper: self.20260502194746.708	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502194746.708 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 19:48	Success	-	View
exp_self.20260502194023.707_20260502_194023 Paper: self.20260502194023.707	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502194023.707 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 19:41	Success	-	View
exp_pytrain.20260502193752.176_20260502_193753 Paper: pytrain.20260502193752.176	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 19:38	Success	-	View
exp_self.20260502193104.706_20260502_193105 Paper: self.20260502193104.706	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502193104.706 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 19:32	Success	-	View
exp_self.20260502192337.705_20260502_192337 Paper: self.20260502192337.705	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502192337.705 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 19:24	Success	-	View
exp_self.20260502191610.704_20260502_191611 Paper: self.20260502191610.704	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502191610.704 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 19:17	Success	-	View
exp_self.20260502190847.703_20260502_190847 Paper: self.20260502190847.703	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502190847.703 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 19:09	Success	-	View
exp_pytrain.20260502190617.175_20260502_190617 Paper: pytrain.20260502190617.175	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 19:07	Success	-	View
exp_self.20260502185928.702_20260502_185928 Paper: self.20260502185928.702	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502185928.702 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 19:00	Success	-	View
exp_self.20260502185156.701_20260502_185156 Paper: self.20260502185156.701	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502185156.701 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 18:52	Success	-	View
exp_self.20260502184430.700_20260502_184430 Paper: self.20260502184430.700	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502184430.700 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 18:45	Success	-	View
exp_self.20260502183704.699_20260502_183704 Paper: self.20260502183704.699	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502183704.699 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 18:38	Success	-	View
exp_pytrain.20260502183438.174_20260502_183438 Paper: pytrain.20260502183438.174	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 18:35	Success	-	View
exp_self.20260502182731.698_20260502_182731 Paper: self.20260502182731.698	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502182731.698 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 18:28	Success	-	View
exp_self.20260502181942.697_20260502_181942 Paper: self.20260502181942.697	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502181942.697 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 18:20	Success	-	View
exp_self.20260502181200.696_20260502_181200 Paper: self.20260502181200.696	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502181200.696 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 18:13	Success	-	View
exp_self.20260502180420.695_20260502_180421 Paper: self.20260502180420.695	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502180420.695 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 18:05	Success	-	View
exp_pytrain.20260502180150.173_20260502_180151 Paper: pytrain.20260502180150.173	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 18:02	Success	-	View
exp_self.20260502175434.694_20260502_175434 Paper: self.20260502175434.694	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502175434.694 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 17:55	Success	-	View
exp_self.20260502174702.693_20260502_174702 Paper: self.20260502174702.693	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502174702.693 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 17:48	Success	-	View
exp_self.20260502173924.692_20260502_173924 Paper: self.20260502173924.692	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502173924.692 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 17:40	Success	-	View
exp_self.20260502173148.691_20260502_173148 Paper: self.20260502173148.691	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502173148.691 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 17:32	Success	-	View
exp_pytrain.20260502172915.172_20260502_172915 Paper: pytrain.20260502172915.172	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 17:30	Success	-	View
exp_self.20260502172345.690_20260502_172346 Paper: self.20260502172345.690	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502172345.690 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 17:24	Success	-	View
exp_self.20260502171604.689_20260502_171605 Paper: self.20260502171604.689	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502171604.689 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 17:17	Success	-	View
exp_self.20260502170823.688_20260502_170823 Paper: self.20260502170823.688	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502170823.688 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 17:09	Success	-	View
exp_self.20260502170029.687_20260502_170029 Paper: self.20260502170029.687	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502170029.687 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 17:01	Success	-	View
exp_pytrain.20260502165745.171_20260502_165745 Paper: pytrain.20260502165745.171	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 16:58	Success	-	View
exp_self.20260502165049.686_20260502_165050 Paper: self.20260502165049.686	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502165049.686 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 16:51	Success	-	View
exp_self.20260502164311.685_20260502_164311 Paper: self.20260502164311.685	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502164311.685 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 16:44	Success	-	View
exp_self.20260502163538.684_20260502_163539 Paper: self.20260502163538.684	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502163538.684 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 16:36	Success	-	View
exp_self.20260502162807.683_20260502_162807 Paper: self.20260502162807.683	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502162807.683 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 16:29	Success	-	View
exp_pytrain.20260502162532.170_20260502_162532 Paper: pytrain.20260502162532.170	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 16:26	Success	-	View
exp_self.20260502161839.682_20260502_161839 Paper: self.20260502161839.682	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502161839.682 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 16:19	Success	-	View
exp_self.20260502161110.681_20260502_161111 Paper: self.20260502161110.681	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502161110.681 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 16:12	Success	-	View
exp_self.20260502160339.680_20260502_160340 Paper: self.20260502160339.680	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502160339.680 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 16:04	Success	-	View
exp_self.20260502155607.679_20260502_155608 Paper: self.20260502155607.679	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502155607.679 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 15:57	Success	-	View
exp_pytrain.20260502155334.169_20260502_155334 Paper: pytrain.20260502155334.169	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 15:54	Success	-	View
exp_self.20260502154635.678_20260502_154635 Paper: self.20260502154635.678	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502154635.678 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 15:47	Success	-	View
exp_self.20260502153910.677_20260502_153910 Paper: self.20260502153910.677	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502153910.677 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 15:40	Success	-	View
exp_self.20260502153142.676_20260502_153143 Paper: self.20260502153142.676	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502153142.676 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 15:32	Success	-	View
exp_self.20260502152407.675_20260502_152407 Paper: self.20260502152407.675	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502152407.675 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 15:25	Success	-	View
exp_pytrain.20260502152143.168_20260502_152143 Paper: pytrain.20260502152143.168	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 15:22	Success	-	View
exp_self.20260502151451.674_20260502_151452 Paper: self.20260502151451.674	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502151451.674 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 15:15	Success	-	View
exp_self.20260502150718.673_20260502_150718 Paper: self.20260502150718.673	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502150718.673 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 15:08	Success	-	View
exp_self.20260502145936.672_20260502_145937 Paper: self.20260502145936.672	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502145936.672 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 15:00	Success	-	View
exp_self.20260502145203.671_20260502_145204 Paper: self.20260502145203.671	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502145203.671 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 14:53	Success	-	View
exp_pytrain.20260502144933.167_20260502_144934 Paper: pytrain.20260502144933.167	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 14:50	Success	-	View
exp_self.20260502144241.670_20260502_144241 Paper: self.20260502144241.670	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502144241.670 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 14:43	Success	-	View
exp_self.20260502143511.669_20260502_143511 Paper: self.20260502143511.669	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502143511.669 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 14:36	Success	-	View
exp_self.20260502142738.668_20260502_142738 Paper: self.20260502142738.668	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502142738.668 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 14:28	Success	-	View
exp_self.20260502142003.667_20260502_142004 Paper: self.20260502142003.667	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502142003.667 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 14:21	Success	-	View
exp_pytrain.20260502141737.166_20260502_141738 Paper: pytrain.20260502141737.166	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 14:18	Success	-	View
exp_self.20260502141038.666_20260502_141038 Paper: self.20260502141038.666	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502141038.666 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 14:11	Success	-	View
exp_self.20260502140315.665_20260502_140315 Paper: self.20260502140315.665	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502140315.665 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 14:04	Success	-	View
exp_self.20260502135545.664_20260502_135545 Paper: self.20260502135545.664	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502135545.664 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 13:56	Success	-	View
exp_self.20260502134809.663_20260502_134809 Paper: self.20260502134809.663	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502134809.663 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 13:49	Success	-	View
exp_pytrain.20260502134535.165_20260502_134536 Paper: pytrain.20260502134535.165	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 13:46	Success	-	View
exp_self.20260502133830.662_20260502_133831 Paper: self.20260502133830.662	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502133830.662 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 13:39	Success	-	View
exp_self.20260502133059.661_20260502_133059 Paper: self.20260502133059.661	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502133059.661 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 13:32	Success	-	View
exp_self.20260502132326.660_20260502_132326 Paper: self.20260502132326.660	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502132326.660 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 13:24	Success	-	View
exp_self.20260502131555.659_20260502_131555 Paper: self.20260502131555.659	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502131555.659 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 13:16	Success	-	View
exp_pytrain.20260502131317.164_20260502_131317 Paper: pytrain.20260502131317.164	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 13:14	Success	-	View
exp_self.20260502130614.658_20260502_130615 Paper: self.20260502130614.658	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502130614.658 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 13:07	Success	-	View
exp_self.20260502125843.657_20260502_125843 Paper: self.20260502125843.657	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502125843.657 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 12:59	Success	-	View
exp_self.20260502125058.656_20260502_125058 Paper: self.20260502125058.656	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502125058.656 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 12:52	Success	-	View
exp_self.20260502124327.655_20260502_124327 Paper: self.20260502124327.655	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502124327.655 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 12:44	Success	-	View
exp_pytrain.20260502124051.163_20260502_124051 Paper: pytrain.20260502124051.163	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 12:41	Success	-	View
exp_self.20260502123355.654_20260502_123355 Paper: self.20260502123355.654	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502123355.654 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 12:34	Success	-	View
exp_self.20260502122623.653_20260502_122624 Paper: self.20260502122623.653	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502122623.653 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 12:27	Success	-	View
exp_self.20260502121855.652_20260502_121856 Paper: self.20260502121855.652	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502121855.652 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 12:19	Success	-	View
exp_self.20260502121133.651_20260502_121133 Paper: self.20260502121133.651	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502121133.651 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 12:12	Success	-	View
exp_pytrain.20260502120902.162_20260502_120902 Paper: pytrain.20260502120902.162	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 12:10	Success	-	View
exp_self.20260502120215.650_20260502_120215 Paper: self.20260502120215.650	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502120215.650 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 12:03	Success	-	View
exp_self.20260502115447.649_20260502_115448 Paper: self.20260502115447.649	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502115447.649 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 11:55	Success	-	View
exp_gh_mcarbonell_supermario-optimizer_20260502_115135 Paper: gh_mcarbonell_supermario-optimizer	mcarbonell/supermario-optimizer Paper ID: gh_mcarbonell_supermario-optimizer - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal:...	05-02 11:52	Success	-	View
exp_self.20260502114719.648_20260502_114719 Paper: self.20260502114719.648	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502114719.648 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 11:48	Success	-	View
exp_self.20260502113935.647_20260502_113936 Paper: self.20260502113935.647	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502113935.647 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 11:40	Success	-	View
exp_pytrain.20260502113709.161_20260502_113710 Paper: pytrain.20260502113709.161	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 11:38	Success	-	View
exp_self.20260502113005.646_20260502_113005 Paper: self.20260502113005.646	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502113005.646 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 11:31	Success	-	View
exp_self.20260502112242.645_20260502_112242 Paper: self.20260502112242.645	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502112242.645 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 11:23	Success	-	View
exp_self.20260502111517.644_20260502_111517 Paper: self.20260502111517.644	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502111517.644 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 11:16	Success	-	View
exp_self.20260502110748.643_20260502_110749 Paper: self.20260502110748.643	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502110748.643 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 11:08	Success	-	View
exp_pytrain.20260502110519.160_20260502_110520 Paper: pytrain.20260502110519.160	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 11:06	Success	-	View
exp_self.20260502105818.642_20260502_105819 Paper: self.20260502105818.642	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502105818.642 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 10:59	Success	-	View
exp_self.20260502105049.641_20260502_105049 Paper: self.20260502105049.641	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502105049.641 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 10:51	Success	-	View
exp_self.20260502104325.640_20260502_104326 Paper: self.20260502104325.640	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502104325.640 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 10:44	Success	-	View
exp_self.20260502103600.639_20260502_103600 Paper: self.20260502103600.639	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502103600.639 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 10:37	Success	-	View
exp_pytrain.20260502103329.159_20260502_103329 Paper: pytrain.20260502103329.159	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 10:34	Success	-	View
exp_self.20260502102638.638_20260502_102639 Paper: self.20260502102638.638	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502102638.638 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 10:27	Success	-	View
exp_self.20260502101913.637_20260502_101913 Paper: self.20260502101913.637	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502101913.637 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 10:20	Success	-	View
exp_self.20260502101133.636_20260502_101134 Paper: self.20260502101133.636	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502101133.636 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 10:12	Success	-	View
exp_self.20260502100405.635_20260502_100405 Paper: self.20260502100405.635	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502100405.635 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 10:05	Success	-	View
exp_pytrain.20260502100133.158_20260502_100133 Paper: pytrain.20260502100133.158	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 10:02	Success	-	View
exp_self.20260502095441.634_20260502_095442 Paper: self.20260502095441.634	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502095441.634 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 09:55	Success	-	View
exp_self.20260502094716.633_20260502_094716 Paper: self.20260502094716.633	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502094716.633 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 09:48	Success	-	View
exp_self.20260502093949.632_20260502_093949 Paper: self.20260502093949.632	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502093949.632 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 09:40	Success	-	View
exp_self.20260502093226.631_20260502_093226 Paper: self.20260502093226.631	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502093226.631 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 09:33	Success	-	View
exp_pytrain.20260502092955.157_20260502_092955 Paper: pytrain.20260502092955.157	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 09:30	Success	-	View
exp_self.20260502092305.630_20260502_092305 Paper: self.20260502092305.630	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502092305.630 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 09:24	Success	-	View
exp_self.20260502091537.629_20260502_091538 Paper: self.20260502091537.629	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502091537.629 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 09:16	Success	-	View
exp_self.20260502090813.628_20260502_090813 Paper: self.20260502090813.628	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502090813.628 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 09:09	Success	-	View
exp_self.20260502090047.627_20260502_090047 Paper: self.20260502090047.627	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502090047.627 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 09:01	Success	-	View
exp_pytrain.20260502085817.156_20260502_085818 Paper: pytrain.20260502085817.156	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 08:59	Success	-	View
exp_self.20260502085131.626_20260502_085132 Paper: self.20260502085131.626	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502085131.626 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 08:52	Success	-	View
exp_self.20260502084403.625_20260502_084403 Paper: self.20260502084403.625	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502084403.625 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 08:45	Success	-	View
exp_self.20260502083638.624_20260502_083638 Paper: self.20260502083638.624	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502083638.624 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 08:37	Success	-	View
exp_self.20260502082911.623_20260502_082911 Paper: self.20260502082911.623	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502082911.623 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 08:30	Success	-	View
exp_pytrain.20260502082641.155_20260502_082642 Paper: pytrain.20260502082641.155	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 08:27	Success	-	View
exp_self.20260502081947.622_20260502_081947 Paper: self.20260502081947.622	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502081947.622 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 08:20	Success	-	View
exp_self.20260502081218.621_20260502_081218 Paper: self.20260502081218.621	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502081218.621 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 08:13	Success	-	View
exp_self.20260502080443.620_20260502_080443 Paper: self.20260502080443.620	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502080443.620 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 08:05	Success	-	View
exp_self.20260502075711.619_20260502_075711 Paper: self.20260502075711.619	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502075711.619 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 07:58	Success	-	View
exp_pytrain.20260502075443.154_20260502_075443 Paper: pytrain.20260502075443.154	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 07:55	Success	-	View
exp_self.20260502074743.618_20260502_074744 Paper: self.20260502074743.618	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502074743.618 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 07:48	Success	-	View
exp_self.20260502074020.617_20260502_074020 Paper: self.20260502074020.617	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502074020.617 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 07:41	Success	-	View
exp_self.20260502073252.616_20260502_073253 Paper: self.20260502073252.616	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502073252.616 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 07:33	Success	-	View
exp_self.20260502072527.615_20260502_072527 Paper: self.20260502072527.615	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502072527.615 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 07:26	Success	-	View
exp_pytrain.20260502072302.153_20260502_072302 Paper: pytrain.20260502072302.153	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 07:24	Success	-	View
exp_self.20260502071604.614_20260502_071604 Paper: self.20260502071604.614	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502071604.614 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 07:17	Success	-	View
exp_self.20260502070841.613_20260502_070841 Paper: self.20260502070841.613	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502070841.613 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 07:09	Success	-	View
exp_self.20260502070115.612_20260502_070115 Paper: self.20260502070115.612	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502070115.612 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 07:02	Success	-	View
exp_self.20260502065339.611_20260502_065340 Paper: self.20260502065339.611	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502065339.611 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 06:54	Success	-	View
exp_pytrain.20260502065106.152_20260502_065107 Paper: pytrain.20260502065106.152	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 06:52	Success	-	View
exp_gh_airdropkalami_awesome-gpu-for-llm_20260502_064816 Paper: gh_airdropkalami_awesome-gpu-for-llm	airdropkalami/awesome-gpu-for-llm Paper ID: gh_airdropkalami_awesome-gpu-for-llm - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signa...	05-02 06:49	Success	-	View
exp_self.20260502064458.610_20260502_064458 Paper: self.20260502064458.610	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502064458.610 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 06:46	Success	-	View
exp_self.20260502063727.609_20260502_063727 Paper: self.20260502063727.609	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502063727.609 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 06:38	Success	-	View
exp_self.20260502062955.608_20260502_062956 Paper: self.20260502062955.608	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502062955.608 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 06:30	Success	-	View
exp_self.20260502062212.607_20260502_062213 Paper: self.20260502062212.607	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502062212.607 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 06:23	Success	-	View
exp_pytrain.20260502061933.151_20260502_061933 Paper: pytrain.20260502061933.151	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 06:20	Success	-	View
exp_self.20260502061224.606_20260502_061225 Paper: self.20260502061224.606	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502061224.606 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 06:13	Success	-	View
exp_self.20260502060449.605_20260502_060450 Paper: self.20260502060449.605	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502060449.605 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 06:05	Success	-	View
exp_self.20260502055718.604_20260502_055718 Paper: self.20260502055718.604	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502055718.604 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 05:58	Success	-	View
exp_self.20260502054945.603_20260502_054945 Paper: self.20260502054945.603	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502054945.603 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 05:50	Success	-	View
exp_pytrain.20260502054712.150_20260502_054712 Paper: pytrain.20260502054712.150	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 05:48	Success	-	View
exp_self.20260502054005.602_20260502_054005 Paper: self.20260502054005.602	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502054005.602 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 05:41	Success	-	View
exp_self.20260502053235.601_20260502_053235 Paper: self.20260502053235.601	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502053235.601 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 05:33	Success	-	View
exp_self.20260502052508.600_20260502_052509 Paper: self.20260502052508.600	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502052508.600 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 05:26	Success	-	View
exp_self.20260502051738.599_20260502_051739 Paper: self.20260502051738.599	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502051738.599 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 05:18	Success	-	View
exp_pytrain.20260502051507.149_20260502_051507 Paper: pytrain.20260502051507.149	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 05:16	Success	-	View
exp_self.20260502050805.598_20260502_050805 Paper: self.20260502050805.598	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502050805.598 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 05:09	Success	-	View
exp_self.20260502050031.597_20260502_050032 Paper: self.20260502050031.597	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502050031.597 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 05:01	Success	-	View
exp_self.20260502045254.596_20260502_045254 Paper: self.20260502045254.596	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502045254.596 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 04:53	Success	-	View
exp_self.20260502044524.595_20260502_044525 Paper: self.20260502044524.595	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502044524.595 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 04:46	Success	-	View
exp_pytrain.20260502044246.148_20260502_044247 Paper: pytrain.20260502044246.148	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 04:43	Success	-	View
exp_self.20260502043552.594_20260502_043553 Paper: self.20260502043552.594	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502043552.594 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 04:36	Success	-	View
exp_self.20260502042826.593_20260502_042826 Paper: self.20260502042826.593	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502042826.593 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 04:29	Success	-	View
exp_self.20260502042102.592_20260502_042102 Paper: self.20260502042102.592	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502042102.592 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 04:22	Success	-	View
exp_self.20260502041354.591_20260502_041354 Paper: self.20260502041354.591	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502041354.591 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 04:14	Success	-	View
exp_pytrain.20260502041129.147_20260502_041129 Paper: pytrain.20260502041129.147	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 04:12	Success	-	View
exp_self.20260502040430.590_20260502_040430 Paper: self.20260502040430.590	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502040430.590 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 04:05	Success	-	View
exp_self.20260502035705.589_20260502_035705 Paper: self.20260502035705.589	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502035705.589 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 03:58	Success	-	View
exp_self.20260502034941.588_20260502_034941 Paper: self.20260502034941.588	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502034941.588 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 03:50	Success	-	View
exp_self.20260502034212.587_20260502_034212 Paper: self.20260502034212.587	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502034212.587 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 03:43	Success	-	View
exp_pytrain.20260502033947.146_20260502_033947 Paper: pytrain.20260502033947.146	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 03:40	Success	-	View
exp_self.20260502033246.586_20260502_033246 Paper: self.20260502033246.586	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502033246.586 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 03:33	Success	-	View
exp_self.20260502032524.585_20260502_032524 Paper: self.20260502032524.585	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502032524.585 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 03:26	Success	-	View
exp_self.20260502031759.584_20260502_031800 Paper: self.20260502031759.584	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502031759.584 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 03:19	Success	-	View
exp_self.20260502031030.583_20260502_031030 Paper: self.20260502031030.583	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502031030.583 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 03:11	Success	-	View
exp_pytrain.20260502030802.145_20260502_030802 Paper: pytrain.20260502030802.145	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 03:09	Success	-	View
exp_self.20260502030104.582_20260502_030104 Paper: self.20260502030104.582	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502030104.582 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 03:02	Success	-	View
exp_self.20260502025341.581_20260502_025342 Paper: self.20260502025341.581	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502025341.581 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 02:54	Success	-	View
exp_self.20260502024618.580_20260502_024618 Paper: self.20260502024618.580	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502024618.580 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 02:47	Success	-	View
exp_self.20260502023853.579_20260502_023854 Paper: self.20260502023853.579	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502023853.579 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 02:39	Success	-	View
exp_pytrain.20260502023621.144_20260502_023621 Paper: pytrain.20260502023621.144	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 02:37	Success	-	View
exp_self.20260502023104.578_20260502_023104 Paper: self.20260502023104.578	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502023104.578 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 02:32	Success	-	View
exp_gh_cryptopoly_ChaosEngineAI_20260502_022642 Paper: gh_cryptopoly_ChaosEngineAI	cryptopoly/ChaosEngineAI Paper ID: gh_cryptopoly_ChaosEngineAI - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recove...	05-02 02:27	Success	-	View
exp_self.20260502022329.577_20260502_022329 Paper: self.20260502022329.577	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502022329.577 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 02:24	Success	-	View
exp_self.20260502021605.576_20260502_021605 Paper: self.20260502021605.576	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502021605.576 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 02:17	Success	-	View
exp_self.20260502020836.575_20260502_020837 Paper: self.20260502020836.575	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502020836.575 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 02:09	Success	-	View
exp_pytrain.20260502020459.143_20260502_020500 Paper: pytrain.20260502020459.143	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 02:06	Success	-	View
exp_self.20260502020046.574_20260502_020046 Paper: self.20260502020046.574	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502020046.574 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 02:01	Success	-	View
exp_self.20260502015322.573_20260502_015322 Paper: self.20260502015322.573	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502015322.573 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 01:54	Success	-	View
exp_self.20260502014548.572_20260502_014548 Paper: self.20260502014548.572	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502014548.572 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 01:46	Success	-	View
exp_self.20260502013820.571_20260502_013820 Paper: self.20260502013820.571	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502013820.571 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 01:39	Success	-	View
exp_pytrain.20260502013337.142_20260502_013337 Paper: pytrain.20260502013337.142	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 01:34	Success	-	View
exp_self.20260502013139.570_20260502_013139 Paper: self.20260502013139.570	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502013139.570 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 01:32	Success	-	View
exp_self.20260502012415.569_20260502_012415 Paper: self.20260502012415.569	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502012415.569 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 01:25	Success	-	View
exp_self.20260502011646.568_20260502_011646 Paper: self.20260502011646.568	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502011646.568 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 01:17	Success	-	View
exp_self.20260502010919.567_20260502_010920 Paper: self.20260502010919.567	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502010919.567 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 01:10	Success	-	View
exp_self.20260502010237.566_20260502_010238 Paper: self.20260502010237.566	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502010237.566 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 01:03	Success	-	View
exp_pytrain.20260502010011.141_20260502_010011 Paper: pytrain.20260502010011.141	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 01:01	Success	-	View
exp_self.20260502005311.565_20260502_005312 Paper: self.20260502005311.565	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502005311.565 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 00:54	Success	-	View
exp_self.20260502004550.564_20260502_004550 Paper: self.20260502004550.564	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502004550.564 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 00:46	Success	-	View
exp_self.20260502003815.563_20260502_003816 Paper: self.20260502003815.563	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502003815.563 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 00:39	Success	-	View
exp_self.20260502003049.562_20260502_003049 Paper: self.20260502003049.562	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502003049.562 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 00:31	Success	-	View
exp_pytrain.20260502002821.140_20260502_002821 Paper: pytrain.20260502002821.140	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-02 00:29	Success	-	View
exp_self.20260502002129.561_20260502_002129 Paper: self.20260502002129.561	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502002129.561 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 00:22	Success	-	View
exp_self.20260502001402.560_20260502_001402 Paper: self.20260502001402.560	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502001402.560 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 00:15	Success	-	View
exp_self.20260502000638.559_20260502_000638 Paper: self.20260502000638.559	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260502000638.559 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 00:07	Success	-	View
exp_self.20260501235902.558_20260501_235902 Paper: self.20260501235902.558	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501235902.558 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-02 00:00	Success	-	View
exp_pytrain.20260501235632.139_20260501_235632 Paper: pytrain.20260501235632.139	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 23:57	Success	-	View
exp_self.20260501234941.557_20260501_234941 Paper: self.20260501234941.557	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501234941.557 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 23:50	Success	-	View
exp_self.20260501234216.556_20260501_234217 Paper: self.20260501234216.556	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501234216.556 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 23:43	Success	-	View
exp_self.20260501233454.555_20260501_233455 Paper: self.20260501233454.555	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501233454.555 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 23:35	Success	-	View
exp_self.20260501232731.554_20260501_232731 Paper: self.20260501232731.554	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501232731.554 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 23:28	Success	-	View
exp_pytrain.20260501232459.138_20260501_232459 Paper: pytrain.20260501232459.138	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 23:26	Success	-	View
exp_self.20260501231808.553_20260501_231808 Paper: self.20260501231808.553	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501231808.553 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 23:19	Success	-	View
exp_self.20260501231039.552_20260501_231039 Paper: self.20260501231039.552	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501231039.552 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 23:11	Success	-	View
exp_self.20260501230316.551_20260501_230316 Paper: self.20260501230316.551	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501230316.551 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 23:04	Success	-	View
exp_self.20260501225553.550_20260501_225554 Paper: self.20260501225553.550	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501225553.550 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 22:56	Success	-	View
exp_pytrain.20260501225321.137_20260501_225321 Paper: pytrain.20260501225321.137	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 22:54	Success	-	View
exp_self.20260501224633.549_20260501_224633 Paper: self.20260501224633.549	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501224633.549 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 22:47	Success	-	View
exp_self.20260501223904.548_20260501_223904 Paper: self.20260501223904.548	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501223904.548 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 22:40	Success	-	View
exp_self.20260501223139.547_20260501_223139 Paper: self.20260501223139.547	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501223139.547 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 22:32	Success	-	View
exp_self.20260501222416.546_20260501_222417 Paper: self.20260501222416.546	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501222416.546 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 22:25	Success	-	View
exp_pytrain.20260501222146.136_20260501_222147 Paper: pytrain.20260501222146.136	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 22:22	Success	-	View
exp_self.20260501221500.545_20260501_221500 Paper: self.20260501221500.545	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501221500.545 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 22:16	Success	-	View
exp_self.20260501220729.544_20260501_220729 Paper: self.20260501220729.544	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501220729.544 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 22:08	Success	-	View
exp_self.20260501220004.543_20260501_220004 Paper: self.20260501220004.543	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501220004.543 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 22:01	Success	-	View
exp_self.20260501215239.542_20260501_215240 Paper: self.20260501215239.542	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501215239.542 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 21:53	Success	-	View
exp_pytrain.20260501215015.135_20260501_215015 Paper: pytrain.20260501215015.135	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 21:51	Success	-	View
exp_self.20260501214324.541_20260501_214324 Paper: self.20260501214324.541	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501214324.541 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 21:44	Success	-	View
exp_self.20260501213557.540_20260501_213558 Paper: self.20260501213557.540	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501213557.540 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 21:37	Success	-	View
exp_self.20260501212829.539_20260501_212829 Paper: self.20260501212829.539	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501212829.539 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 21:29	Success	-	View
exp_self.20260501212103.538_20260501_212103 Paper: self.20260501212103.538	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501212103.538 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 21:22	Success	-	View
exp_pytrain.20260501211839.134_20260501_211839 Paper: pytrain.20260501211839.134	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 21:19	Success	-	View
exp_gh_Pearlfisheryjersey8695_kalshiquant_20260501_211555 Paper: gh_Pearlfisheryjersey8695_kalshiquant	Pearlfisheryjersey8695/kalshiquant Paper ID: gh_Pearlfisheryjersey8695_kalshiquant - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Sign...	05-01 21:16	Success	-	View
exp_self.20260501211028.537_20260501_211029 Paper: self.20260501211028.537	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501211028.537 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 21:11	Success	-	View
exp_self.20260501210306.536_20260501_210306 Paper: self.20260501210306.536	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501210306.536 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 21:04	Success	-	View
exp_self.20260501205539.535_20260501_205540 Paper: self.20260501205539.535	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501205539.535 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 20:56	Success	-	View
exp_self.20260501204816.534_20260501_204816 Paper: self.20260501204816.534	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501204816.534 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 20:49	Success	-	View
exp_pytrain.20260501204551.133_20260501_204551 Paper: pytrain.20260501204551.133	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 20:46	Success	-	View
exp_self.20260501203854.533_20260501_203854 Paper: self.20260501203854.533	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501203854.533 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 20:39	Success	-	View
exp_self.20260501203129.532_20260501_203130 Paper: self.20260501203129.532	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501203129.532 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 20:32	Success	-	View
exp_self.20260501202406.531_20260501_202406 Paper: self.20260501202406.531	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501202406.531 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 20:25	Success	-	View
exp_self.20260501201633.530_20260501_201633 Paper: self.20260501201633.530	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501201633.530 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 20:17	Success	-	View
exp_pytrain.20260501201407.132_20260501_201408 Paper: pytrain.20260501201407.132	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 20:15	Success	-	View
exp_self.20260501200709.529_20260501_200710 Paper: self.20260501200709.529	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501200709.529 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 20:08	Success	-	View
exp_self.20260501195944.528_20260501_195944 Paper: self.20260501195944.528	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501195944.528 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 20:00	Success	-	View
exp_self.20260501195221.527_20260501_195222 Paper: self.20260501195221.527	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501195221.527 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 19:53	Success	-	View
exp_self.20260501194453.526_20260501_194453 Paper: self.20260501194453.526	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501194453.526 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 19:45	Success	-	View
exp_pytrain.20260501194226.131_20260501_194226 Paper: pytrain.20260501194226.131	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 19:43	Success	-	View
exp_self.20260501193535.525_20260501_193535 Paper: self.20260501193535.525	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501193535.525 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 19:36	Success	-	View
exp_self.20260501192811.524_20260501_192811 Paper: self.20260501192811.524	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501192811.524 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 19:29	Success	-	View
exp_self.20260501192047.523_20260501_192047 Paper: self.20260501192047.523	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501192047.523 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 19:21	Success	-	View
exp_self.20260501191322.522_20260501_191322 Paper: self.20260501191322.522	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501191322.522 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 19:14	Success	-	View
exp_pytrain.20260501191050.130_20260501_191051 Paper: pytrain.20260501191050.130	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 19:11	Success	-	View
exp_self.20260501190354.521_20260501_190355 Paper: self.20260501190354.521	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501190354.521 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 19:04	Success	-	View
exp_self.20260501185626.520_20260501_185627 Paper: self.20260501185626.520	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501185626.520 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 18:57	Success	-	View
exp_self.20260501184858.519_20260501_184858 Paper: self.20260501184858.519	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501184858.519 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 18:50	Success	-	View
exp_self.20260501184129.518_20260501_184129 Paper: self.20260501184129.518	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501184129.518 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 18:42	Success	-	View
exp_pytrain.20260501183854.129_20260501_183855 Paper: pytrain.20260501183854.129	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 18:39	Success	-	View
exp_self.20260501183154.517_20260501_183154 Paper: self.20260501183154.517	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501183154.517 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 18:32	Success	-	View
exp_self.20260501182421.516_20260501_182421 Paper: self.20260501182421.516	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501182421.516 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 18:25	Success	-	View
exp_self.20260501181633.515_20260501_181633 Paper: self.20260501181633.515	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501181633.515 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 18:17	Success	-	View
exp_self.20260501180900.514_20260501_180901 Paper: self.20260501180900.514	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501180900.514 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 18:10	Success	-	View
exp_pytrain.20260501180610.128_20260501_180610 Paper: pytrain.20260501180610.128	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 18:07	Success	-	View
exp_self.20260501175917.513_20260501_175918 Paper: self.20260501175917.513	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501175917.513 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 18:00	Success	-	View
exp_self.20260501175141.512_20260501_175142 Paper: self.20260501175141.512	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501175141.512 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 17:52	Success	-	View
exp_self.20260501174359.511_20260501_174359 Paper: self.20260501174359.511	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501174359.511 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 17:45	Success	-	View
exp_self.20260501173631.510_20260501_173631 Paper: self.20260501173631.510	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501173631.510 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 17:37	Success	-	View
exp_pytrain.20260501173356.127_20260501_173357 Paper: pytrain.20260501173356.127	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 17:34	Success	-	View
exp_self.20260501172705.509_20260501_172705 Paper: self.20260501172705.509	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501172705.509 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 17:28	Success	-	View
exp_self.20260501171932.508_20260501_171932 Paper: self.20260501171932.508	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501171932.508 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 17:20	Success	-	View
exp_self.20260501171202.507_20260501_171202 Paper: self.20260501171202.507	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501171202.507 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 17:13	Success	-	View
exp_self.20260501170425.506_20260501_170426 Paper: self.20260501170425.506	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501170425.506 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 17:05	Success	-	View
exp_pytrain.20260501170155.126_20260501_170155 Paper: pytrain.20260501170155.126	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 17:02	Success	-	View
exp_self.20260501165450.505_20260501_165450 Paper: self.20260501165450.505	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501165450.505 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 16:55	Success	-	View
exp_self.20260501164656.504_20260501_164657 Paper: self.20260501164656.504	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501164656.504 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 16:47	Success	-	View
exp_self.20260501163922.503_20260501_163922 Paper: self.20260501163922.503	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501163922.503 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 16:40	Success	-	View
exp_self.20260501163154.502_20260501_163154 Paper: self.20260501163154.502	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501163154.502 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 16:32	Success	-	View
exp_pytrain.20260501162929.125_20260501_162930 Paper: pytrain.20260501162929.125	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 16:30	Success	-	View
exp_self.20260501162231.501_20260501_162231 Paper: self.20260501162231.501	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501162231.501 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 16:23	Success	-	View
exp_self.20260501161506.500_20260501_161506 Paper: self.20260501161506.500	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501161506.500 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 16:16	Success	-	View
exp_self.20260501160737.499_20260501_160737 Paper: self.20260501160737.499	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501160737.499 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 16:08	Success	-	View
exp_self.20260501155947.498_20260501_155947 Paper: self.20260501155947.498	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501155947.498 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 16:00	Success	-	View
exp_pytrain.20260501155717.124_20260501_155717 Paper: pytrain.20260501155717.124	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 15:58	Success	-	View
exp_self.20260501155013.497_20260501_155014 Paper: self.20260501155013.497	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501155013.497 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 15:51	Success	-	View
exp_self.20260501154243.496_20260501_154244 Paper: self.20260501154243.496	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501154243.496 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 15:43	Success	-	View
exp_self.20260501153512.495_20260501_153512 Paper: self.20260501153512.495	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501153512.495 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 15:36	Success	-	View
exp_self.20260501152738.494_20260501_152739 Paper: self.20260501152738.494	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501152738.494 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 15:28	Success	-	View
exp_pytrain.20260501152509.123_20260501_152510 Paper: pytrain.20260501152509.123	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 15:26	Success	-	View
exp_self.20260501151806.493_20260501_151806 Paper: self.20260501151806.493	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501151806.493 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 15:19	Success	-	View
exp_self.20260501151038.492_20260501_151038 Paper: self.20260501151038.492	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501151038.492 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 15:11	Success	-	View
exp_self.20260501150306.491_20260501_150307 Paper: self.20260501150306.491	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501150306.491 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 15:04	Success	-	View
exp_self.20260501145532.490_20260501_145532 Paper: self.20260501145532.490	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501145532.490 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 14:56	Success	-	View
exp_pytrain.20260501145301.122_20260501_145301 Paper: pytrain.20260501145301.122	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 14:54	Success	-	View
exp_self.20260501144600.489_20260501_144601 Paper: self.20260501144600.489	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501144600.489 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 14:47	Success	-	View
exp_hf_2604.24954_20260501_144136 Paper: hf_2604.24954	Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence Paper ID: hf_2604.24954 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-01 14:42	Success	-	View
exp_self.20260501143826.488_20260501_143826 Paper: self.20260501143826.488	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501143826.488 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 14:39	Success	-	View
exp_self.20260501143051.487_20260501_143052 Paper: self.20260501143051.487	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501143051.487 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 14:31	Success	-	View
exp_self.20260501142327.486_20260501_142328 Paper: self.20260501142327.486	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501142327.486 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 14:24	Success	-	View
exp_pytrain.20260501142102.121_20260501_142103 Paper: pytrain.20260501142102.121	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 14:22	Success	-	View
exp_self.20260501141411.485_20260501_141411 Paper: self.20260501141411.485	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501141411.485 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 14:15	Success	-	View
exp_self.20260501140645.484_20260501_140645 Paper: self.20260501140645.484	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501140645.484 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 14:07	Success	-	View
exp_self.20260501135915.483_20260501_135916 Paper: self.20260501135915.483	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501135915.483 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 14:00	Success	-	View
exp_self.20260501135149.482_20260501_135150 Paper: self.20260501135149.482	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501135149.482 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 13:52	Success	-	View
exp_pytrain.20260501134925.120_20260501_134925 Paper: pytrain.20260501134925.120	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 13:50	Success	-	View
exp_self.20260501134227.481_20260501_134227 Paper: self.20260501134227.481	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501134227.481 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 13:43	Success	-	View
exp_self.20260501133501.480_20260501_133501 Paper: self.20260501133501.480	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501133501.480 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 13:36	Success	-	View
exp_self.20260501132732.479_20260501_132732 Paper: self.20260501132732.479	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501132732.479 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 13:28	Success	-	View
exp_self.20260501132001.478_20260501_132002 Paper: self.20260501132001.478	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501132001.478 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 13:21	Success	-	View
exp_pytrain.20260501131733.119_20260501_131734 Paper: pytrain.20260501131733.119	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 13:18	Success	-	View
exp_self.20260501131032.477_20260501_131033 Paper: self.20260501131032.477	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501131032.477 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 13:11	Success	-	View
exp_self.20260501130305.476_20260501_130305 Paper: self.20260501130305.476	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501130305.476 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 13:04	Success	-	View
exp_self.20260501125536.475_20260501_125536 Paper: self.20260501125536.475	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501125536.475 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 12:56	Success	-	View
exp_self.20260501124805.474_20260501_124805 Paper: self.20260501124805.474	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501124805.474 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 12:49	Success	-	View
exp_pytrain.20260501124538.118_20260501_124538 Paper: pytrain.20260501124538.118	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 12:46	Success	-	View
exp_self.20260501124018.473_20260501_124018 Paper: self.20260501124018.473	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501124018.473 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 12:41	Success	-	View
exp_hf_2604.27151_20260501_123658 Paper: hf_2604.27151	Step-level Optimization for Efficient Computer-use Agents Paper ID: hf_2604.27151 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-01 12:38	Success	-	View
exp_self.20260501123130.472_20260501_123131 Paper: self.20260501123130.472	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501123130.472 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 12:32	Success	-	View
exp_self.20260501122400.471_20260501_122401 Paper: self.20260501122400.471	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501122400.471 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 12:25	Success	-	View
exp_self.20260501121634.470_20260501_121634 Paper: self.20260501121634.470	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501121634.470 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 12:17	Success	-	View
exp_pytrain.20260501121401.117_20260501_121402 Paper: pytrain.20260501121401.117	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 12:15	Success	-	View
exp_self.20260501120711.469_20260501_120711 Paper: self.20260501120711.469	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501120711.469 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 12:08	Success	-	View
exp_self.20260501115936.468_20260501_115936 Paper: self.20260501115936.468	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501115936.468 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 12:00	Success	-	View
exp_self.20260501115207.467_20260501_115208 Paper: self.20260501115207.467	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501115207.467 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 11:53	Success	-	View
exp_self.20260501114438.466_20260501_114438 Paper: self.20260501114438.466	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501114438.466 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 11:45	Success	-	View
exp_pytrain.20260501114209.116_20260501_114209 Paper: pytrain.20260501114209.116	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 11:43	Success	-	View
exp_self.20260501113459.465_20260501_113500 Paper: self.20260501113459.465	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501113459.465 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 11:36	Success	-	View
exp_self.20260501112728.464_20260501_112728 Paper: self.20260501112728.464	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501112728.464 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 11:28	Success	-	View
exp_self.20260501111952.463_20260501_111953 Paper: self.20260501111952.463	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501111952.463 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 11:20	Success	-	View
exp_self.20260501111220.462_20260501_111221 Paper: self.20260501111220.462	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501111220.462 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 11:13	Success	-	View
exp_pytrain.20260501110950.115_20260501_110950 Paper: pytrain.20260501110950.115	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 11:10	Success	-	View
exp_self.20260501110244.461_20260501_110244 Paper: self.20260501110244.461	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501110244.461 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 11:03	Success	-	View
exp_self.20260501105540.460_20260501_105540 Paper: self.20260501105540.460	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501105540.460 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 10:56	Success	-	View
exp_self.20260501104837.459_20260501_104837 Paper: self.20260501104837.459	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501104837.459 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 10:49	Success	-	View
exp_self.20260501104137.458_20260501_104137 Paper: self.20260501104137.458	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501104137.458 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 10:42	Success	-	View
exp_pytrain.20260501103812.114_20260501_103812 Paper: pytrain.20260501103812.114	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 10:39	Success	-	View
exp_self.20260501103207.457_20260501_103208 Paper: self.20260501103207.457	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501103207.457 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 10:33	Success	-	View
exp_self.20260501102502.456_20260501_102502 Paper: self.20260501102502.456	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501102502.456 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 10:26	Success	-	View
exp_self.20260501101638.455_20260501_101639 Paper: self.20260501101638.455	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501101638.455 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 10:17	Success	-	View
exp_self.20260501100932.454_20260501_100932 Paper: self.20260501100932.454	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501100932.454 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 10:10	Success	-	View
exp_pytrain.20260501100620.113_20260501_100621 Paper: pytrain.20260501100620.113	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 10:07	Success	-	View
exp_self.20260501095942.453_20260501_095942 Paper: self.20260501095942.453	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501095942.453 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 10:00	Success	-	View
exp_self.20260501095231.452_20260501_095231 Paper: self.20260501095231.452	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501095231.452 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 09:53	Success	-	View
exp_self.20260501094529.451_20260501_094529 Paper: self.20260501094529.451	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501094529.451 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 09:46	Success	-	View
exp_hf_2604.28157_20260501_094142 Paper: hf_2604.28157	FlashRT: Towards Computationally and Memory Efficient Red-Teaming for Prompt Injection and Knowledge Corruption Paper ID: hf_2604.28157 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-01 09:42	Success	-	View
exp_self.20260501093641.450_20260501_093641 Paper: self.20260501093641.450	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501093641.450 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 09:37	Success	-	View
exp_pytrain.20260501093349.112_20260501_093350 Paper: pytrain.20260501093349.112	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 09:34	Success	-	View
exp_self.20260501092742.449_20260501_092742 Paper: self.20260501092742.449	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501092742.449 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 09:28	Success	-	View
exp_self.20260501092005.448_20260501_092006 Paper: self.20260501092005.448	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501092005.448 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 09:21	Success	-	View
exp_self.20260501091233.447_20260501_091233 Paper: self.20260501091233.447	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501091233.447 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 09:13	Success	-	View
exp_self.20260501090500.446_20260501_090500 Paper: self.20260501090500.446	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501090500.446 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 09:06	Success	-	View
exp_pytrain.20260501090227.111_20260501_090227 Paper: pytrain.20260501090227.111	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 09:03	Success	-	View
exp_self.20260501085522.445_20260501_085522 Paper: self.20260501085522.445	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501085522.445 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 08:56	Success	-	View
exp_self.20260501084747.444_20260501_084748 Paper: self.20260501084747.444	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501084747.444 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 08:48	Success	-	View
exp_self.20260501084010.443_20260501_084011 Paper: self.20260501084010.443	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501084010.443 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 08:41	Success	-	View
exp_self.20260501083234.442_20260501_083234 Paper: self.20260501083234.442	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501083234.442 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 08:33	Success	-	View
exp_pytrain.20260501083005.110_20260501_083006 Paper: pytrain.20260501083005.110	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 08:31	Success	-	View
exp_self.20260501082302.441_20260501_082302 Paper: self.20260501082302.441	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501082302.441 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 08:24	Success	-	View
exp_self.20260501081526.440_20260501_081527 Paper: self.20260501081526.440	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501081526.440 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 08:16	Success	-	View
exp_self.20260501080747.439_20260501_080748 Paper: self.20260501080747.439	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501080747.439 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 08:08	Success	-	View
exp_self.20260501080011.438_20260501_080011 Paper: self.20260501080011.438	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501080011.438 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 08:01	Success	-	View
exp_pytrain.20260501075740.109_20260501_075740 Paper: pytrain.20260501075740.109	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 07:58	Success	-	View
exp_self.20260501075034.437_20260501_075035 Paper: self.20260501075034.437	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501075034.437 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 07:51	Success	-	View
exp_self.20260501074301.436_20260501_074301 Paper: self.20260501074301.436	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501074301.436 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 07:44	Success	-	View
exp_self.20260501073527.435_20260501_073527 Paper: self.20260501073527.435	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501073527.435 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 07:36	Success	-	View
exp_self.20260501072750.434_20260501_072751 Paper: self.20260501072750.434	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501072750.434 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 07:28	Success	-	View
exp_pytrain.20260501072519.108_20260501_072519 Paper: pytrain.20260501072519.108	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 07:26	Success	-	View
exp_self.20260501071816.433_20260501_071817 Paper: self.20260501071816.433	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501071816.433 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 07:19	Success	-	View
exp_self.20260501071046.432_20260501_071047 Paper: self.20260501071046.432	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501071046.432 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 07:11	Success	-	View
exp_self.20260501070314.431_20260501_070315 Paper: self.20260501070314.431	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501070314.431 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 07:04	Success	-	View
exp_self.20260501065536.430_20260501_065537 Paper: self.20260501065536.430	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501065536.430 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 06:56	Success	-	View
exp_pytrain.20260501065303.107_20260501_065304 Paper: pytrain.20260501065303.107	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 06:54	Success	-	View
exp_self.20260501064557.429_20260501_064558 Paper: self.20260501064557.429	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501064557.429 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 06:47	Success	-	View
exp_self.20260501063826.428_20260501_063826 Paper: self.20260501063826.428	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501063826.428 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 06:39	Success	-	View
exp_self.20260501063055.427_20260501_063055 Paper: self.20260501063055.427	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501063055.427 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 06:31	Success	-	View
exp_self.20260501062322.426_20260501_062322 Paper: self.20260501062322.426	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501062322.426 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 06:24	Success	-	View
exp_pytrain.20260501062042.106_20260501_062043 Paper: pytrain.20260501062042.106	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 06:21	Success	-	View
exp_self.20260501061339.425_20260501_061339 Paper: self.20260501061339.425	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501061339.425 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 06:14	Success	-	View
exp_self.20260501060605.424_20260501_060605 Paper: self.20260501060605.424	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501060605.424 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 06:07	Success	-	View
exp_self.20260501055833.423_20260501_055833 Paper: self.20260501055833.423	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501055833.423 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 05:59	Success	-	View
exp_self.20260501055057.422_20260501_055058 Paper: self.20260501055057.422	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501055057.422 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 05:52	Success	-	View
exp_pytrain.20260501054820.105_20260501_054820 Paper: pytrain.20260501054820.105	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 05:49	Success	-	View
exp_self.20260501054122.421_20260501_054122 Paper: self.20260501054122.421	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501054122.421 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 05:42	Success	-	View
exp_self.20260501053338.420_20260501_053338 Paper: self.20260501053338.420	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501053338.420 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 05:34	Success	-	View
exp_self.20260501052610.419_20260501_052610 Paper: self.20260501052610.419	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501052610.419 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 05:27	Success	-	View
exp_self.20260501051837.418_20260501_051838 Paper: self.20260501051837.418	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501051837.418 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 05:19	Success	-	View
exp_pytrain.20260501051600.104_20260501_051601 Paper: pytrain.20260501051600.104	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 05:17	Success	-	View
exp_self.20260501051001.417_20260501_051002 Paper: self.20260501051001.417	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501051001.417 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 05:11	Success	-	View
exp_self.20260501050223.416_20260501_050224 Paper: self.20260501050223.416	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501050223.416 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 05:03	Success	-	View
exp_hf_2604.27251_20260501_045903 Paper: hf_2604.27251	Compliance versus Sensibility: On the Reasoning Controllability in Large Language Models Paper ID: hf_2604.27251 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-01 05:00	Success	-	View
exp_self.20260501045442.415_20260501_045442 Paper: self.20260501045442.415	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501045442.415 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 04:55	Success	-	View
exp_self.20260501044707.414_20260501_044708 Paper: self.20260501044707.414	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501044707.414 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 04:48	Success	-	View
exp_pytrain.20260501044431.103_20260501_044431 Paper: pytrain.20260501044431.103	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 04:45	Success	-	View
exp_self.20260501043727.413_20260501_043728 Paper: self.20260501043727.413	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501043727.413 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 04:38	Success	-	View
exp_self.20260501042943.412_20260501_042944 Paper: self.20260501042943.412	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501042943.412 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 04:30	Success	-	View
exp_self.20260501042201.411_20260501_042201 Paper: self.20260501042201.411	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501042201.411 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 04:23	Success	-	View
exp_self.20260501041427.410_20260501_041428 Paper: self.20260501041427.410	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501041427.410 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 04:15	Success	-	View
exp_pytrain.20260501041152.102_20260501_041152 Paper: pytrain.20260501041152.102	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 04:12	Success	-	View
exp_self.20260501040457.409_20260501_040457 Paper: self.20260501040457.409	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501040457.409 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 04:06	Success	-	View
exp_self.20260501035717.408_20260501_035717 Paper: self.20260501035717.408	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501035717.408 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 03:58	Success	-	View
exp_self.20260501034936.407_20260501_034936 Paper: self.20260501034936.407	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501034936.407 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 03:50	Success	-	View
exp_self.20260501034207.406_20260501_034207 Paper: self.20260501034207.406	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501034207.406 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 03:43	Success	-	View
exp_pytrain.20260501033936.101_20260501_033936 Paper: pytrain.20260501033936.101	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 03:40	Success	-	View
exp_self.20260501033231.405_20260501_033232 Paper: self.20260501033231.405	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501033231.405 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 03:33	Success	-	View
exp_self.20260501032446.404_20260501_032446 Paper: self.20260501032446.404	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501032446.404 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 03:25	Success	-	View
exp_self.20260501031706.403_20260501_031707 Paper: self.20260501031706.403	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501031706.403 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 03:18	Success	-	View
exp_self.20260501030930.402_20260501_030930 Paper: self.20260501030930.402	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501030930.402 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 03:10	Success	-	View
exp_pytrain.20260501030701.100_20260501_030701 Paper: pytrain.20260501030701.100	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 03:08	Success	-	View
exp_self.20260501025955.401_20260501_025955 Paper: self.20260501025955.401	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501025955.401 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 03:00	Success	-	View
exp_self.20260501025222.400_20260501_025223 Paper: self.20260501025222.400	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501025222.400 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 02:53	Success	-	View
exp_self.20260501024443.399_20260501_024444 Paper: self.20260501024443.399	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501024443.399 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 02:45	Success	-	View
exp_self.20260501023709.398_20260501_023709 Paper: self.20260501023709.398	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501023709.398 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 02:38	Success	-	View
exp_pytrain.20260501023440.099_20260501_023440 Paper: pytrain.20260501023440.099	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 02:35	Success	-	View
exp_self.20260501022905.397_20260501_022905 Paper: self.20260501022905.397	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501022905.397 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 02:30	Success	-	View
exp_self.20260501022136.396_20260501_022136 Paper: self.20260501022136.396	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501022136.396 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 02:22	Success	-	View
exp_self.20260501021335.395_20260501_021336 Paper: self.20260501021335.395	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501021335.395 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 02:14	Success	-	View
exp_self.20260501020615.394_20260501_020616 Paper: self.20260501020615.394	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501020615.394 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 02:07	Success	-	View
exp_pytrain.20260501020300.098_20260501_020301 Paper: pytrain.20260501020300.098	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 02:04	Success	-	View
exp_self.20260501015628.393_20260501_015628 Paper: self.20260501015628.393	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501015628.393 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 01:57	Success	-	View
exp_self.20260501014808.392_20260501_014808 Paper: self.20260501014808.392	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501014808.392 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 01:49	Success	-	View
exp_self.20260501014104.391_20260501_014104 Paper: self.20260501014104.391	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501014104.391 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 01:42	Success	-	View
exp_self.20260501013359.390_20260501_013400 Paper: self.20260501013359.390	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501013359.390 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 01:35	Success	-	View
exp_pytrain.20260501013049.097_20260501_013050 Paper: pytrain.20260501013049.097	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 01:31	Success	-	View
exp_self.20260501012550.389_20260501_012550 Paper: self.20260501012550.389	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501012550.389 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 01:26	Success	-	View
exp_self.20260501011845.388_20260501_011845 Paper: self.20260501011845.388	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501011845.388 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 01:19	Success	-	View
exp_self.20260501011028.387_20260501_011029 Paper: self.20260501011028.387	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501011028.387 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 01:11	Success	-	View
exp_self.20260501010208.386_20260501_010208 Paper: self.20260501010208.386	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501010208.386 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 01:03	Success	-	View
exp_pytrain.20260501005851.096_20260501_005851 Paper: pytrain.20260501005851.096	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 00:59	Success	-	View
exp_self.20260501005248.385_20260501_005248 Paper: self.20260501005248.385	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501005248.385 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 00:53	Success	-	View
exp_self.20260501004506.384_20260501_004506 Paper: self.20260501004506.384	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501004506.384 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 00:46	Success	-	View
exp_self.20260501003730.383_20260501_003730 Paper: self.20260501003730.383	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501003730.383 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 00:38	Success	-	View
exp_self.20260501002958.382_20260501_002958 Paper: self.20260501002958.382	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501002958.382 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 00:31	Success	-	View
exp_pytrain.20260501002728.095_20260501_002728 Paper: pytrain.20260501002728.095	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	05-01 00:28	Success	-	View
exp_hf_2604.27039_20260501_002331 Paper: hf_2604.27039	Length Value Model: Scalable Value Pretraining for Token-Level Length Modeling Paper ID: hf_2604.27039 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-01 00:24	Success	-	View
exp_self.20260501002123.381_20260501_002123 Paper: self.20260501002123.381	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501002123.381 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 00:22	Success	-	View
exp_self.20260501001351.380_20260501_001352 Paper: self.20260501001351.380	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501001351.380 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 00:14	Success	-	View
exp_hf_2604.27085_20260501_000818 Paper: hf_2604.27085	Efficient Training on Multiple Consumer GPUs with RoundPipe Paper ID: hf_2604.27085 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	05-01 00:09	Success	-	View
exp_self.20260501000612.379_20260501_000612 Paper: self.20260501000612.379	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260501000612.379 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	05-01 00:07	Success	-	View
exp_self.20260430235842.378_20260430_235842 Paper: self.20260430235842.378	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430235842.378 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 23:59	Success	-	View
exp_pytrain.20260430235607.094_20260430_235607 Paper: pytrain.20260430235607.094	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 23:57	Success	-	View
exp_self.20260430234911.377_20260430_234911 Paper: self.20260430234911.377	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430234911.377 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 23:50	Success	-	View
exp_self.20260430234139.376_20260430_234140 Paper: self.20260430234139.376	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430234139.376 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 23:42	Success	-	View
exp_self.20260430233410.375_20260430_233410 Paper: self.20260430233410.375	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430233410.375 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 23:35	Success	-	View
exp_self.20260430232640.374_20260430_232640 Paper: self.20260430232640.374	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430232640.374 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 23:27	Success	-	View
exp_pytrain.20260430232403.093_20260430_232403 Paper: pytrain.20260430232403.093	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 23:25	Success	-	View
exp_self.20260430231943.373_20260430_231943 Paper: self.20260430231943.373	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430231943.373 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 23:20	Success	-	View
exp_self.20260430231212.372_20260430_231212 Paper: self.20260430231212.372	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430231212.372 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 23:13	Success	-	View
exp_self.20260430230438.371_20260430_230439 Paper: self.20260430230438.371	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430230438.371 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 23:05	Success	-	View
exp_self.20260430225707.370_20260430_225707 Paper: self.20260430225707.370	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430225707.370 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 22:58	Success	-	View
exp_pytrain.20260430225217.092_20260430_225217 Paper: pytrain.20260430225217.092	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 22:53	Success	-	View
exp_self.20260430225012.369_20260430_225013 Paper: self.20260430225012.369	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430225012.369 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 22:51	Success	-	View
exp_self.20260430224239.368_20260430_224239 Paper: self.20260430224239.368	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430224239.368 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 22:43	Success	-	View
exp_self.20260430223508.367_20260430_223508 Paper: self.20260430223508.367	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430223508.367 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 22:36	Success	-	View
exp_self.20260430222737.366_20260430_222738 Paper: self.20260430222737.366	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430222737.366 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 22:28	Success	-	View
exp_hf_2604.27083_20260430_222418 Paper: hf_2604.27083	Co-Evolving Policy Distillation Paper ID: hf_2604.27083 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-30 22:25	Success	-	View
exp_pytrain.20260430221953.091_20260430_221954 Paper: pytrain.20260430221953.091	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 22:20	Success	-	View
exp_self.20260430221749.365_20260430_221749 Paper: self.20260430221749.365	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430221749.365 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 22:18	Success	-	View
exp_self.20260430221014.364_20260430_221015 Paper: self.20260430221014.364	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430221014.364 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 22:11	Success	-	View
exp_self.20260430220245.363_20260430_220245 Paper: self.20260430220245.363	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430220245.363 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 22:03	Success	-	View
exp_hf_2604.28130_20260430_215946 Paper: hf_2604.28130	MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons Paper ID: hf_2604.28130 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-30 22:00	Success	-	View
exp_self.20260430215242.362_20260430_215243 Paper: self.20260430215242.362	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430215242.362 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 21:53	Success	-	View
exp_pytrain.20260430214757.090_20260430_214757 Paper: pytrain.20260430214757.090	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 21:48	Success	-	View
exp_self.20260430214554.361_20260430_214555 Paper: self.20260430214554.361	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430214554.361 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 21:46	Success	-	View
exp_self.20260430213820.360_20260430_213820 Paper: self.20260430213820.360	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430213820.360 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 21:39	Success	-	View
exp_2604.28190v1_20260430_213249 Paper: 2604.28190v1	Representation Fréchet Loss for Visual Generation Paper ID: 2604.28190v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-30 21:33	Success	-	View
exp_self.20260430213042.359_20260430_213043 Paper: self.20260430213042.359	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430213042.359 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 21:31	Success	-	View
exp_hf_2604.28169_20260430_212721 Paper: hf_2604.28169	PhyCo: Learning Controllable Physical Priors for Generative Motion Paper ID: hf_2604.28169 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-30 21:28	Success	-	View
exp_self.20260430212149.358_20260430_212149 Paper: self.20260430212149.358	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430212149.358 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 21:22	Success	-	View
exp_2604.28193v1_20260430_211833 Paper: 2604.28193v1	Generalizable Sparse-View 3D Reconstruction from Unconstrained Images Paper ID: 2604.28193v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-30 21:19	Success	-	View
exp_pytrain.20260430211515.089_20260430_211515 Paper: pytrain.20260430211515.089	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 21:16	Success	-	View
exp_self.20260430211208.357_20260430_211208 Paper: self.20260430211208.357	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430211208.357 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 21:13	Success	-	View
exp_hf_2604.28190_20260430_210845 Paper: hf_2604.28190	Representation Fréchet Loss for Visual Generation Paper ID: hf_2604.28190 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-30 21:09	Success	-	View
exp_hf_2604.28185_20260430_210444 Paper: hf_2604.28185	Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling Paper ID: hf_2604.28185 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-30 21:05	Success	-	View
exp_self.20260430210127.356_20260430_210127 Paper: self.20260430210127.356	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430210127.356 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 21:02	Success	-	View
exp_self.20260430205355.355_20260430_205356 Paper: self.20260430205355.355	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430205355.355 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 20:54	Success	-	View
exp_self.20260430204624.354_20260430_204624 Paper: self.20260430204624.354	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430204624.354 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 20:47	Success	-	View
exp_pytrain.20260430204349.088_20260430_204349 Paper: pytrain.20260430204349.088	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 20:44	Success	-	View
exp_self.20260430203656.353_20260430_203656 Paper: self.20260430203656.353	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430203656.353 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 20:37	Success	-	View
exp_self.20260430202920.352_20260430_202920 Paper: self.20260430202920.352	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430202920.352 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 20:30	Success	-	View
exp_2604.28056v1_20260430_202603 Paper: 2604.28056v1	RHyVE: Competence-Aware Verification and Phase-Aware Deployment for LLM-Generated Reward Hypotheses Paper ID: 2604.28056v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-30 20:27	Success	-	View
exp_self.20260430202145.351_20260430_202145 Paper: self.20260430202145.351	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430202145.351 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 20:22	Success	-	View
exp_hf_2604.23758_20260430_201823 Paper: hf_2604.23758	Agentic Fusion of Large Atomic and Language Models to Accelerate Superconductors Discovery Paper ID: hf_2604.23758 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-30 20:19	Success	-	View
exp_self.20260430201407.350_20260430_201407 Paper: self.20260430201407.350	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430201407.350 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 20:15	Success	-	View
exp_pytrain.20260430201136.087_20260430_201136 Paper: pytrain.20260430201136.087	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 20:12	Success	-	View
exp_self.20260430200432.349_20260430_200432 Paper: self.20260430200432.349	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430200432.349 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 20:05	Success	-	View
exp_self.20260430195702.348_20260430_195702 Paper: self.20260430195702.348	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430195702.348 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 19:58	Success	-	View
exp_self.20260430194934.347_20260430_194934 Paper: self.20260430194934.347	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430194934.347 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 19:50	Success	-	View
exp_self.20260430194158.346_20260430_194159 Paper: self.20260430194158.346	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430194158.346 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 19:43	Success	-	View
exp_pytrain.20260430193925.086_20260430_193925 Paper: pytrain.20260430193925.086	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 19:40	Success	-	View
exp_self.20260430193223.345_20260430_193223 Paper: self.20260430193223.345	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430193223.345 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 19:33	Success	-	View
exp_self.20260430192453.344_20260430_192453 Paper: self.20260430192453.344	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430192453.344 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 19:25	Success	-	View
exp_self.20260430191723.343_20260430_191723 Paper: self.20260430191723.343	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430191723.343 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 19:18	Success	-	View
exp_self.20260430190950.342_20260430_190951 Paper: self.20260430190950.342	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430190950.342 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 19:10	Success	-	View
exp_pytrain.20260430190711.085_20260430_190711 Paper: pytrain.20260430190711.085	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 19:08	Success	-	View
exp_self.20260430190014.341_20260430_190014 Paper: self.20260430190014.341	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430190014.341 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 19:01	Success	-	View
exp_self.20260430185241.340_20260430_185242 Paper: self.20260430185241.340	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430185241.340 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 18:53	Success	-	View
exp_self.20260430184513.339_20260430_184513 Paper: self.20260430184513.339	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430184513.339 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 18:46	Success	-	View
exp_self.20260430183743.338_20260430_183743 Paper: self.20260430183743.338	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430183743.338 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 18:38	Success	-	View
exp_pytrain.20260430183509.084_20260430_183509 Paper: pytrain.20260430183509.084	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 18:36	Success	-	View
exp_self.20260430182808.337_20260430_182809 Paper: self.20260430182808.337	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430182808.337 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 18:29	Success	-	View
exp_self.20260430182037.336_20260430_182037 Paper: self.20260430182037.336	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430182037.336 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 18:21	Success	-	View
exp_self.20260430181256.335_20260430_181256 Paper: self.20260430181256.335	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430181256.335 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 18:13	Success	-	View
exp_self.20260430180529.334_20260430_180530 Paper: self.20260430180529.334	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430180529.334 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 18:06	Success	-	View
exp_pytrain.20260430180255.083_20260430_180256 Paper: pytrain.20260430180255.083	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 18:03	Success	-	View
exp_self.20260430175550.333_20260430_175551 Paper: self.20260430175550.333	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430175550.333 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 17:56	Success	-	View
exp_self.20260430174816.332_20260430_174816 Paper: self.20260430174816.332	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430174816.332 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 17:49	Success	-	View
exp_self.20260430174050.331_20260430_174051 Paper: self.20260430174050.331	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430174050.331 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 17:41	Success	-	View
exp_self.20260430173329.330_20260430_173330 Paper: self.20260430173329.330	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430173329.330 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 17:34	Success	-	View
exp_pytrain.20260430173016.082_20260430_173016 Paper: pytrain.20260430173016.082	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 17:31	Success	-	View
exp_self.20260430172442.329_20260430_172442 Paper: self.20260430172442.329	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430172442.329 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 17:25	Success	-	View
exp_self.20260430171659.328_20260430_171700 Paper: self.20260430171659.328	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430171659.328 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 17:18	Success	-	View
exp_self.20260430170915.327_20260430_170916 Paper: self.20260430170915.327	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430170915.327 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 17:10	Success	-	View
exp_self.20260430170116.326_20260430_170117 Paper: self.20260430170116.326	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430170116.326 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 17:02	Success	-	View
exp_pytrain.20260430165837.081_20260430_165838 Paper: pytrain.20260430165837.081	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 16:59	Success	-	View
exp_self.20260430165117.325_20260430_165118 Paper: self.20260430165117.325	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430165117.325 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 16:52	Success	-	View
exp_self.20260430164352.324_20260430_164353 Paper: self.20260430164352.324	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430164352.324 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 16:44	Success	-	View
exp_self.20260430163616.323_20260430_163617 Paper: self.20260430163616.323	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430163616.323 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 16:37	Success	-	View
exp_self.20260430162849.322_20260430_162849 Paper: self.20260430162849.322	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430162849.322 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 16:29	Success	-	View
exp_pytrain.20260430162619.080_20260430_162619 Paper: pytrain.20260430162619.080	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 16:27	Success	-	View
exp_self.20260430161923.321_20260430_161923 Paper: self.20260430161923.321	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430161923.321 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 16:20	Success	-	View
exp_self.20260430161152.320_20260430_161152 Paper: self.20260430161152.320	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430161152.320 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 16:12	Success	-	View
exp_self.20260430160430.319_20260430_160430 Paper: self.20260430160430.319	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430160430.319 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 16:05	Success	-	View
exp_self.20260430155704.318_20260430_155704 Paper: self.20260430155704.318	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430155704.318 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 15:58	Success	-	View
exp_pytrain.20260430155440.079_20260430_155441 Paper: pytrain.20260430155440.079	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 15:55	Success	-	View
exp_self.20260430154740.317_20260430_154740 Paper: self.20260430154740.317	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430154740.317 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 15:48	Success	-	View
exp_self.20260430154016.316_20260430_154017 Paper: self.20260430154016.316	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430154016.316 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 15:41	Success	-	View
exp_self.20260430153243.315_20260430_153243 Paper: self.20260430153243.315	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430153243.315 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 15:33	Success	-	View
exp_self.20260430152513.314_20260430_152514 Paper: self.20260430152513.314	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430152513.314 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 15:26	Success	-	View
exp_pytrain.20260430152244.078_20260430_152245 Paper: pytrain.20260430152244.078	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 15:23	Success	-	View
exp_self.20260430151542.313_20260430_151542 Paper: self.20260430151542.313	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430151542.313 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 15:16	Success	-	View
exp_self.20260430150813.312_20260430_150814 Paper: self.20260430150813.312	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430150813.312 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 15:09	Success	-	View
exp_self.20260430150045.311_20260430_150045 Paper: self.20260430150045.311	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430150045.311 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 15:01	Success	-	View
exp_self.20260430145311.310_20260430_145311 Paper: self.20260430145311.310	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430145311.310 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 14:54	Success	-	View
exp_pytrain.20260430145039.077_20260430_145040 Paper: pytrain.20260430145039.077	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 14:51	Success	-	View
exp_self.20260430144338.309_20260430_144338 Paper: self.20260430144338.309	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430144338.309 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 14:44	Success	-	View
exp_self.20260430143611.308_20260430_143611 Paper: self.20260430143611.308	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430143611.308 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 14:37	Success	-	View
exp_self.20260430142840.307_20260430_142840 Paper: self.20260430142840.307	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430142840.307 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 14:29	Success	-	View
exp_self.20260430142108.306_20260430_142109 Paper: self.20260430142108.306	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430142108.306 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 14:22	Success	-	View
exp_pytrain.20260430141841.076_20260430_141841 Paper: pytrain.20260430141841.076	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 14:19	Success	-	View
exp_self.20260430141131.305_20260430_141131 Paper: self.20260430141131.305	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430141131.305 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 14:12	Success	-	View
exp_self.20260430140341.304_20260430_140342 Paper: self.20260430140341.304	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430140341.304 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 14:04	Success	-	View
exp_self.20260430135607.303_20260430_135607 Paper: self.20260430135607.303	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430135607.303 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 13:57	Success	-	View
exp_self.20260430134839.302_20260430_134839 Paper: self.20260430134839.302	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430134839.302 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 13:49	Success	-	View
exp_pytrain.20260430134605.075_20260430_134605 Paper: pytrain.20260430134605.075	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 13:47	Success	-	View
exp_self.20260430133915.301_20260430_133916 Paper: self.20260430133915.301	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430133915.301 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 13:40	Success	-	View
exp_self.20260430133150.300_20260430_133150 Paper: self.20260430133150.300	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430133150.300 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 13:32	Success	-	View
exp_self.20260430132427.299_20260430_132428 Paper: self.20260430132427.299	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430132427.299 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 13:25	Success	-	View
exp_self.20260430131703.298_20260430_131704 Paper: self.20260430131703.298	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430131703.298 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 13:18	Success	-	View
exp_pytrain.20260430131432.074_20260430_131432 Paper: pytrain.20260430131432.074	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 13:15	Success	-	View
exp_hf_2604.23426_20260430_130929 Paper: hf_2604.23426	Enhanced Privacy and Communication Efficiency in Non-IID Federated Learning with Adaptive Quantization and Differential... Paper ID: hf_2604.23426 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-30 13:10	Success	-	View
exp_self.20260430130723.297_20260430_130723 Paper: self.20260430130723.297	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430130723.297 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 13:08	Success	-	View
exp_hf_2604.25135_20260430_130257 Paper: hf_2604.25135	FAMA: Failure-Aware Meta-Agentic Framework for Open-Source LLMs in Interactive Tool Use Environments Paper ID: hf_2604.25135 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-30 13:03	Success	-	View
exp_self.20260430130021.296_20260430_130021 Paper: self.20260430130021.296	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430130021.296 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 13:01	Success	-	View
exp_hf_2604.26091_20260430_125701 Paper: hf_2604.26091	Operating-Layer Controls for Onchain Language-Model Agents Under Real Capital Paper ID: hf_2604.26091 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-30 12:58	Success	-	View
exp_self.20260430125241.295_20260430_125241 Paper: self.20260430125241.295	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430125241.295 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 12:53	Success	-	View
exp_self.20260430124510.294_20260430_124510 Paper: self.20260430124510.294	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430124510.294 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 12:46	Success	-	View
exp_pytrain.20260430124235.073_20260430_124235 Paper: pytrain.20260430124235.073	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 12:43	Success	-	View
exp_self.20260430123524.293_20260430_123524 Paper: self.20260430123524.293	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430123524.293 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 12:36	Success	-	View
exp_self.20260430122737.292_20260430_122738 Paper: self.20260430122737.292	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430122737.292 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 12:28	Success	-	View
exp_self.20260430121957.291_20260430_121957 Paper: self.20260430121957.291	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430121957.291 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 12:21	Success	-	View
exp_self.20260430121225.290_20260430_121226 Paper: self.20260430121225.290	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430121225.290 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 12:13	Success	-	View
exp_pytrain.20260430120947.072_20260430_120947 Paper: pytrain.20260430120947.072	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 12:10	Success	-	View
exp_self.20260430120249.289_20260430_120250 Paper: self.20260430120249.289	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430120249.289 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 12:03	Success	-	View
exp_self.20260430115523.288_20260430_115523 Paper: self.20260430115523.288	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430115523.288 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 11:56	Success	-	View
exp_self.20260430114758.287_20260430_114758 Paper: self.20260430114758.287	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430114758.287 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 11:49	Success	-	View
exp_self.20260430114032.286_20260430_114032 Paper: self.20260430114032.286	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430114032.286 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 11:41	Success	-	View
exp_pytrain.20260430113801.071_20260430_113801 Paper: pytrain.20260430113801.071	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 11:39	Success	-	View
exp_self.20260430113109.285_20260430_113110 Paper: self.20260430113109.285	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430113109.285 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 11:32	Success	-	View
exp_self.20260430112332.284_20260430_112332 Paper: self.20260430112332.284	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430112332.284 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 11:24	Success	-	View
exp_self.20260430111556.283_20260430_111557 Paper: self.20260430111556.283	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430111556.283 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 11:16	Success	-	View
exp_self.20260430110827.282_20260430_110827 Paper: self.20260430110827.282	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430110827.282 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 11:09	Success	-	View
exp_pytrain.20260430110546.070_20260430_110547 Paper: pytrain.20260430110546.070	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 11:06	Success	-	View
exp_self.20260430105859.281_20260430_105900 Paper: self.20260430105859.281	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430105859.281 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 11:00	Success	-	View
exp_self.20260430105128.280_20260430_105129 Paper: self.20260430105128.280	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430105128.280 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 10:52	Success	-	View
exp_self.20260430104355.279_20260430_104356 Paper: self.20260430104355.279	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430104355.279 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 10:44	Success	-	View
exp_self.20260430103628.278_20260430_103629 Paper: self.20260430103628.278	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430103628.278 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 10:37	Success	-	View
exp_pytrain.20260430103404.069_20260430_103404 Paper: pytrain.20260430103404.069	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 10:35	Success	-	View
exp_self.20260430102703.277_20260430_102704 Paper: self.20260430102703.277	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430102703.277 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 10:28	Success	-	View
exp_self.20260430101935.276_20260430_101935 Paper: self.20260430101935.276	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430101935.276 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 10:20	Success	-	View
exp_self.20260430101202.275_20260430_101202 Paper: self.20260430101202.275	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430101202.275 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 10:13	Success	-	View
exp_self.20260430100430.274_20260430_100431 Paper: self.20260430100430.274	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430100430.274 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 10:05	Success	-	View
exp_pytrain.20260430100207.068_20260430_100208 Paper: pytrain.20260430100207.068	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 10:03	Success	-	View
exp_self.20260430095508.273_20260430_095508 Paper: self.20260430095508.273	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430095508.273 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 09:56	Success	-	View
exp_self.20260430094817.272_20260430_094818 Paper: self.20260430094817.272	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430094817.272 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 09:49	Success	-	View
exp_self.20260430094045.271_20260430_094045 Paper: self.20260430094045.271	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430094045.271 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 09:41	Success	-	View
exp_self.20260430093307.270_20260430_093309 Paper: self.20260430093307.270	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430093307.270 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 09:34	Success	-	View
exp_pytrain.20260430093028.067_20260430_093028 Paper: pytrain.20260430093028.067	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 09:31	Success	-	View
exp_self.20260430092326.269_20260430_092327 Paper: self.20260430092326.269	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430092326.269 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 09:24	Success	-	View
exp_self.20260430091555.268_20260430_091555 Paper: self.20260430091555.268	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430091555.268 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 09:16	Success	-	View
exp_self.20260430090828.267_20260430_090828 Paper: self.20260430090828.267	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430090828.267 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 09:09	Success	-	View
exp_self.20260430090058.266_20260430_090058 Paper: self.20260430090058.266	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430090058.266 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 09:02	Success	-	View
exp_pytrain.20260430085827.066_20260430_085828 Paper: pytrain.20260430085827.066	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 08:59	Success	-	View
exp_hf_2604.24351_20260430_085542 Paper: hf_2604.24351	Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion Paper ID: hf_2604.24351 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-30 08:56	Success	-	View
exp_self.20260430085122.265_20260430_085122 Paper: self.20260430085122.265	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430085122.265 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 08:52	Success	-	View
exp_self.20260430084352.264_20260430_084353 Paper: self.20260430084352.264	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430084352.264 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 08:44	Success	-	View
exp_self.20260430083627.263_20260430_083628 Paper: self.20260430083627.263	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430083627.263 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 08:37	Success	-	View
exp_self.20260430082843.262_20260430_082843 Paper: self.20260430082843.262	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430082843.262 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 08:29	Success	-	View
exp_pytrain.20260430082618.065_20260430_082619 Paper: pytrain.20260430082618.065	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 08:27	Success	-	View
exp_self.20260430081914.261_20260430_081914 Paper: self.20260430081914.261	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430081914.261 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 08:20	Success	-	View
exp_self.20260430081140.260_20260430_081141 Paper: self.20260430081140.260	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430081140.260 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 08:12	Success	-	View
exp_self.20260430080408.259_20260430_080408 Paper: self.20260430080408.259	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430080408.259 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 08:05	Success	-	View
exp_self.20260430075626.258_20260430_075627 Paper: self.20260430075626.258	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430075626.258 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 07:57	Success	-	View
exp_pytrain.20260430075356.064_20260430_075357 Paper: pytrain.20260430075356.064	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 07:54	Success	-	View
exp_self.20260430074824.257_20260430_074825 Paper: self.20260430074824.257	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430074824.257 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 07:49	Success	-	View
exp_self.20260430074039.256_20260430_074039 Paper: self.20260430074039.256	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430074039.256 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 07:41	Success	-	View
exp_self.20260430073251.255_20260430_073252 Paper: self.20260430073251.255	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430073251.255 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 07:33	Success	-	View
exp_self.20260430072459.254_20260430_072503 Paper: self.20260430072459.254	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430072459.254 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 07:26	Success	-	View
exp_pytrain.20260430072204.063_20260430_072204 Paper: pytrain.20260430072204.063	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 07:23	Success	-	View
exp_self.20260430071609.253_20260430_071610 Paper: self.20260430071609.253	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430071609.253 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 07:17	Success	-	View
exp_self.20260430070757.252_20260430_070757 Paper: self.20260430070757.252	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430070757.252 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 07:09	Success	-	View
exp_self.20260430070031.251_20260430_070032 Paper: self.20260430070031.251	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430070031.251 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 07:01	Success	-	View
exp_self.20260430065315.250_20260430_065315 Paper: self.20260430065315.250	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430065315.250 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 06:54	Success	-	View
exp_pytrain.20260430065011.062_20260430_065012 Paper: pytrain.20260430065011.062	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 06:51	Success	-	View
exp_self.20260430064353.249_20260430_064353 Paper: self.20260430064353.249	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430064353.249 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 06:44	Success	-	View
exp_self.20260430063630.248_20260430_063631 Paper: self.20260430063630.248	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430063630.248 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 06:37	Success	-	View
exp_self.20260430062932.247_20260430_062932 Paper: self.20260430062932.247	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430062932.247 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 06:30	Success	-	View
exp_self.20260430062239.246_20260430_062240 Paper: self.20260430062239.246	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430062239.246 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 06:23	Success	-	View
exp_pytrain.20260430061731.061_20260430_061732 Paper: pytrain.20260430061731.061	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 06:18	Success	-	View
exp_self.20260430061526.245_20260430_061527 Paper: self.20260430061526.245	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430061526.245 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 06:16	Success	-	View
exp_self.20260430060729.244_20260430_060729 Paper: self.20260430060729.244	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430060729.244 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 06:08	Success	-	View
exp_self.20260430060018.243_20260430_060018 Paper: self.20260430060018.243	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430060018.243 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 06:01	Success	-	View
exp_self.20260430055334.242_20260430_055334 Paper: self.20260430055334.242	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430055334.242 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 05:54	Success	-	View
exp_self.20260430054639.241_20260430_054639 Paper: self.20260430054639.241	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430054639.241 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 05:47	Success	-	View
exp_pytrain.20260430054404.060_20260430_054405 Paper: pytrain.20260430054404.060	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 05:45	Success	-	View
exp_self.20260430053634.240_20260430_053634 Paper: self.20260430053634.240	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430053634.240 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 05:37	Success	-	View
exp_self.20260430052914.239_20260430_052914 Paper: self.20260430052914.239	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430052914.239 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 05:30	Success	-	View
exp_self.20260430052231.238_20260430_052231 Paper: self.20260430052231.238	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430052231.238 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 05:23	Success	-	View
exp_self.20260430051516.237_20260430_051516 Paper: self.20260430051516.237	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430051516.237 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 05:16	Success	-	View
exp_pytrain.20260430051233.059_20260430_051234 Paper: pytrain.20260430051233.059	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 05:13	Success	-	View
exp_self.20260430050608.236_20260430_050608 Paper: self.20260430050608.236	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430050608.236 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 05:07	Success	-	View
exp_oa_W7157506044_20260430_050303 Paper: oa_W7157506044	Focus Session: Hardware and Software Techniques for Accelerating Multimodal Foundation Models Paper ID: oa_W7157506044 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-30 05:04	Success	-	View
exp_oa_W7157506014_20260430_045847 Paper: oa_W7157506014	SpikingBrain2.0: Brain-Inspired Foundation Models for Efficient Long-Context and Cross-Platform Inference Paper ID: oa_W7157506014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-30 04:59	Success	-	View
exp_self.20260430045639.235_20260430_045639 Paper: self.20260430045639.235	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430045639.235 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 04:57	Success	-	View
exp_self.20260430044950.234_20260430_044951 Paper: self.20260430044950.234	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430044950.234 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 04:50	Success	-	View
exp_self.20260430044255.233_20260430_044255 Paper: self.20260430044255.233	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430044255.233 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 04:43	Success	-	View
exp_pytrain.20260430044010.058_20260430_044010 Paper: pytrain.20260430044010.058	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 04:41	Success	-	View
exp_self.20260430043335.232_20260430_043335 Paper: self.20260430043335.232	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430043335.232 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 04:34	Success	-	View
exp_self.20260430042634.231_20260430_042634 Paper: self.20260430042634.231	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430042634.231 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 04:27	Success	-	View
exp_hf_2604.24927_20260430_042145 Paper: hf_2604.24927	Large Language Models Explore by Latent Distilling Paper ID: hf_2604.24927 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-30 04:22	Success	-	View
exp_self.20260430041934.230_20260430_041935 Paper: self.20260430041934.230	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430041934.230 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 04:20	Success	-	View
exp_self.20260430041133.229_20260430_041134 Paper: self.20260430041133.229	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430041133.229 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 04:12	Success	-	View
exp_pytrain.20260430040841.057_20260430_040841 Paper: pytrain.20260430040841.057	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 04:09	Success	-	View
exp_self.20260430040223.228_20260430_040224 Paper: self.20260430040223.228	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430040223.228 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 04:03	Success	-	View
exp_self.20260430035541.227_20260430_035541 Paper: self.20260430035541.227	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430035541.227 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 03:56	Success	-	View
exp_self.20260430034848.226_20260430_034848 Paper: self.20260430034848.226	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430034848.226 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 03:49	Success	-	View
exp_self.20260430034146.225_20260430_034146 Paper: self.20260430034146.225	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430034146.225 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 03:42	Success	-	View
exp_pytrain.20260430033638.056_20260430_033639 Paper: pytrain.20260430033638.056	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 03:37	Success	-	View
exp_self.20260430033433.224_20260430_033434 Paper: self.20260430033433.224	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430033433.224 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 03:35	Success	-	View
exp_self.20260430032728.223_20260430_032728 Paper: self.20260430032728.223	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430032728.223 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 03:28	Success	-	View
exp_self.20260430032046.222_20260430_032046 Paper: self.20260430032046.222	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430032046.222 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 03:21	Success	-	View
exp_self.20260430031243.221_20260430_031243 Paper: self.20260430031243.221	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430031243.221 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 03:13	Success	-	View
exp_self.20260430030550.220_20260430_030550 Paper: self.20260430030550.220	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430030550.220 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 03:06	Success	-	View
exp_pytrain.20260430030257.055_20260430_030258 Paper: pytrain.20260430030257.055	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 03:04	Success	-	View
exp_self.20260430025642.219_20260430_025642 Paper: self.20260430025642.219	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430025642.219 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 02:57	Success	-	View
exp_self.20260430024937.218_20260430_024937 Paper: self.20260430024937.218	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430024937.218 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 02:50	Success	-	View
exp_self.20260430024245.217_20260430_024245 Paper: self.20260430024245.217	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430024245.217 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 02:43	Success	-	View
exp_self.20260430023557.216_20260430_023558 Paper: self.20260430023557.216	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430023557.216 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 02:37	Success	-	View
exp_pytrain.20260430023045.054_20260430_023045 Paper: pytrain.20260430023045.054	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 02:31	Success	-	View
exp_self.20260430022847.215_20260430_022847 Paper: self.20260430022847.215	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430022847.215 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 02:29	Success	-	View
exp_self.20260430022154.214_20260430_022155 Paper: self.20260430022154.214	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430022154.214 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 02:22	Success	-	View
exp_self.20260430021443.213_20260430_021443 Paper: self.20260430021443.213	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430021443.213 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 02:15	Success	-	View
exp_self.20260430020743.212_20260430_020744 Paper: self.20260430020743.212	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430020743.212 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 02:08	Success	-	View
exp_self.20260430020020.211_20260430_020031 Paper: self.20260430020020.211	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430020020.211 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 02:01	Success	-	View
exp_pytrain.20260430015734.053_20260430_015734 Paper: pytrain.20260430015734.053	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 01:58	Success	-	View
exp_self.20260430015125.210_20260430_015125 Paper: self.20260430015125.210	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430015125.210 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 01:52	Success	-	View
exp_self.20260430014418.209_20260430_014418 Paper: self.20260430014418.209	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430014418.209 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 01:45	Success	-	View
exp_self.20260430013706.208_20260430_013706 Paper: self.20260430013706.208	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430013706.208 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 01:38	Success	-	View
exp_self.20260430013020.207_20260430_013020 Paper: self.20260430013020.207	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430013020.207 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 01:31	Success	-	View
exp_pytrain.20260430012520.052_20260430_012520 Paper: pytrain.20260430012520.052	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 01:26	Success	-	View
exp_self.20260430012312.206_20260430_012312 Paper: self.20260430012312.206	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430012312.206 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 01:24	Success	-	View
exp_self.20260430011559.205_20260430_011600 Paper: self.20260430011559.205	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430011559.205 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 01:17	Success	-	View
exp_self.20260430010918.204_20260430_010918 Paper: self.20260430010918.204	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430010918.204 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 01:10	Success	-	View
exp_self.20260430010231.203_20260430_010231 Paper: self.20260430010231.203	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430010231.203 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 01:03	Success	-	View
exp_self.20260430005532.202_20260430_005532 Paper: self.20260430005532.202	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430005532.202 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 00:56	Success	-	View
exp_pytrain.20260430005246.051_20260430_005246 Paper: pytrain.20260430005246.051	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 00:53	Success	-	View
exp_self.20260430004622.201_20260430_004623 Paper: self.20260430004622.201	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430004622.201 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 00:47	Success	-	View
exp_self.20260430003935.200_20260430_003936 Paper: self.20260430003935.200	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430003935.200 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 00:40	Success	-	View
exp_self.20260430003206.199_20260430_003207 Paper: self.20260430003206.199	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430003206.199 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 00:33	Success	-	View
exp_self.20260430002519.198_20260430_002519 Paper: self.20260430002519.198	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430002519.198 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 00:26	Success	-	View
exp_pytrain.20260430002127.050_20260430_002127 Paper: pytrain.20260430002127.050	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-30 00:22	Success	-	View
exp_self.20260430001800.197_20260430_001801 Paper: self.20260430001800.197	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430001800.197 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 00:19	Success	-	View
exp_self.20260430001003.196_20260430_001003 Paper: self.20260430001003.196	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430001003.196 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 00:11	Success	-	View
exp_self.20260430000307.195_20260430_000307 Paper: self.20260430000307.195	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260430000307.195 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-30 00:04	Success	-	View
exp_self.20260429235507.194_20260429_235507 Paper: self.20260429235507.194	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429235507.194 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 23:56	Success	-	View
exp_pytrain.20260429234958.049_20260429_234959 Paper: pytrain.20260429234958.049	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 23:51	Success	-	View
exp_self.20260429234753.193_20260429_234754 Paper: self.20260429234753.193	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429234753.193 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 23:48	Success	-	View
exp_self.20260429234104.192_20260429_234104 Paper: self.20260429234104.192	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429234104.192 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 23:42	Success	-	View
exp_self.20260429233355.191_20260429_233356 Paper: self.20260429233355.191	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429233355.191 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 23:34	Success	-	View
exp_self.20260429232714.190_20260429_232714 Paper: self.20260429232714.190	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429232714.190 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 23:28	Success	-	View
exp_self.20260429232018.189_20260429_232019 Paper: self.20260429232018.189	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429232018.189 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 23:21	Success	-	View
exp_pytrain.20260429231724.048_20260429_231725 Paper: pytrain.20260429231724.048	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 23:18	Success	-	View
exp_self.20260429231302.188_20260429_231303 Paper: self.20260429231302.188	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429231302.188 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 23:14	Success	-	View
exp_self.20260429230506.187_20260429_230506 Paper: self.20260429230506.187	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429230506.187 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 23:06	Success	-	View
exp_self.20260429225732.186_20260429_225732 Paper: self.20260429225732.186	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429225732.186 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 22:59	Success	-	View
exp_self.20260429225004.185_20260429_225004 Paper: self.20260429225004.185	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429225004.185 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 22:51	Success	-	View
exp_pytrain.20260429224537.047_20260429_224538 Paper: pytrain.20260429224537.047	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 22:46	Success	-	View
exp_self.20260429224250.184_20260429_224250 Paper: self.20260429224250.184	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429224250.184 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 22:43	Success	-	View
exp_self.20260429223542.183_20260429_223542 Paper: self.20260429223542.183	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429223542.183 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 22:36	Success	-	View
exp_self.20260429222843.182_20260429_222844 Paper: self.20260429222843.182	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429222843.182 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 22:29	Success	-	View
exp_self.20260429222133.181_20260429_222133 Paper: self.20260429222133.181	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429222133.181 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 22:22	Success	-	View
exp_self.20260429221432.180_20260429_221433 Paper: self.20260429221432.180	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429221432.180 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 22:15	Success	-	View
exp_pytrain.20260429221123.046_20260429_221124 Paper: pytrain.20260429221123.046	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 22:12	Success	-	View
exp_self.20260429220658.179_20260429_220659 Paper: self.20260429220658.179	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429220658.179 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 22:08	Success	-	View
exp_2604.26940v1_20260429_220337 Paper: 2604.26940v1	Select to Think: Unlocking SLM Potential with Local Sufficiency Paper ID: 2604.26940v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-29 22:04	Success	-	View
exp_self.20260429215901.178_20260429_215901 Paper: self.20260429215901.178	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429215901.178 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 22:00	Success	-	View
exp_hf_2604.26951_20260429_215532 Paper: hf_2604.26951	Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models Paper ID: hf_2604.26951 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-29 21:56	Success	-	View
exp_self.20260429215000.177_20260429_215000 Paper: self.20260429215000.177	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429215000.177 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 21:51	Success	-	View
exp_2604.26951v1_20260429_214641 Paper: 2604.26951v1	Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models Paper ID: 2604.26951v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-29 21:47	Success	-	View
exp_self.20260429214213.176_20260429_214213 Paper: self.20260429214213.176	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429214213.176 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 21:43	Success	-	View
exp_pytrain.20260429213918.045_20260429_213918 Paper: pytrain.20260429213918.045	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 21:40	Success	-	View
exp_hf_2604.26779_20260429_213657 Paper: hf_2604.26779	Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding Paper ID: hf_2604.26779 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-29 21:37	Success	-	View
exp_self.20260429212939.175_20260429_212939 Paper: self.20260429212939.175	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429212939.175 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 21:30	Success	-	View
exp_self.20260429212159.174_20260429_212159 Paper: self.20260429212159.174	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429212159.174 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 21:23	Success	-	View
exp_hf_2604.26694_20260429_211854 Paper: hf_2604.26694	Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising Paper ID: hf_2604.26694 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-29 21:19	Success	-	View
exp_self.20260429211140.173_20260429_211140 Paper: self.20260429211140.173	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429211140.173 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 21:12	Success	-	View
exp_pytrain.20260429210644.044_20260429_210644 Paper: pytrain.20260429210644.044	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 21:07	Success	-	View
exp_self.20260429210423.172_20260429_210424 Paper: self.20260429210423.172	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429210423.172 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 21:05	Success	-	View
exp_2604.26868v1_20260429_210126 Paper: 2604.26868v1	Breaking the Rigid Prior: Towards Articulated 3D Anomaly Detection Paper ID: 2604.26868v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-29 21:02	Success	-	View
exp_2604.26857v1_20260429_205717 Paper: 2604.26857v1	Edge AI for Automotive Vulnerable Road User Safety: Deployable Detection via Knowledge Distillation Paper ID: 2604.26857v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-29 20:58	Success	-	View
exp_self.20260429205500.171_20260429_205500 Paper: self.20260429205500.171	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429205500.171 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 20:56	Success	-	View
exp_self.20260429204711.170_20260429_204711 Paper: self.20260429204711.170	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429204711.170 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 20:48	Success	-	View
exp_2604.26866v1_20260429_204345 Paper: 2604.26866v1	MoRFI: Monotonic Sparse Autoencoder Feature Identification Paper ID: 2604.26866v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-29 20:44	Success	-	View
exp_self.20260429203955.169_20260429_203955 Paper: self.20260429203955.169	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429203955.169 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 20:40	Success	-	View
exp_pytrain.20260429203510.043_20260429_203510 Paper: pytrain.20260429203510.043	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 20:36	Success	-	View
exp_self.20260429203302.168_20260429_203303 Paper: self.20260429203302.168	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429203302.168 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 20:34	Success	-	View
exp_cr_10.22214_ijraset.2026.80728_20260429_202947 Paper: cr_10.22214_ijraset.2026.80728	ViT-YOLOv8: A Hybrid Transformer-Convolutional Model for Small Object Classification in UAV Imagery Using VisDrone Paper ID: cr_10.22214_ijraset.2026.80728 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Rec...	04-29 20:30	Success	-	View
exp_self.20260429202521.167_20260429_202522 Paper: self.20260429202521.167	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429202521.167 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 20:26	Success	-	View
exp_self.20260429201426.166_20260429_201426 Paper: self.20260429201426.166	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429201426.166 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 20:15	Success	-	View
exp_self.20260429200546.165_20260429_200547 Paper: self.20260429200546.165	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429200546.165 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 20:06	Success	-	View
exp_pytrain.20260429200309.042_20260429_200309 Paper: pytrain.20260429200309.042	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 20:04	Success	-	View
exp_self.20260429195733.164_20260429_195733 Paper: self.20260429195733.164	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429195733.164 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 19:58	Success	-	View
exp_self.20260429194949.163_20260429_194950 Paper: self.20260429194949.163	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429194949.163 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 19:50	Success	-	View
exp_self.20260429194209.162_20260429_194209 Paper: self.20260429194209.162	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429194209.162 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 19:43	Success	-	View
exp_self.20260429193427.161_20260429_193428 Paper: self.20260429193427.161	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429193427.161 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 19:35	Success	-	View
exp_pytrain.20260429193141.041_20260429_193141 Paper: pytrain.20260429193141.041	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 19:32	Success	-	View
exp_self.20260429192442.160_20260429_192443 Paper: self.20260429192442.160	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429192442.160 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 19:25	Success	-	View
exp_self.20260429191658.159_20260429_191658 Paper: self.20260429191658.159	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429191658.159 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 19:18	Success	-	View
exp_self.20260429190914.158_20260429_190914 Paper: self.20260429190914.158	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429190914.158 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 19:10	Success	-	View
exp_self.20260429190136.157_20260429_190136 Paper: self.20260429190136.157	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429190136.157 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 19:02	Success	-	View
exp_pytrain.20260429185856.040_20260429_185856 Paper: pytrain.20260429185856.040	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 19:00	Success	-	View
exp_self.20260429185141.156_20260429_185142 Paper: self.20260429185141.156	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429185141.156 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 18:52	Success	-	View
exp_self.20260429184358.155_20260429_184359 Paper: self.20260429184358.155	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429184358.155 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 18:45	Success	-	View
exp_self.20260429183613.154_20260429_183613 Paper: self.20260429183613.154	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429183613.154 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 18:37	Success	-	View
exp_self.20260429182851.153_20260429_182852 Paper: self.20260429182851.153	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429182851.153 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 18:29	Success	-	View
exp_pytrain.20260429182627.039_20260429_182627 Paper: pytrain.20260429182627.039	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 18:27	Success	-	View
exp_self.20260429181931.152_20260429_181931 Paper: self.20260429181931.152	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429181931.152 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 18:20	Success	-	View
exp_self.20260429181202.151_20260429_181202 Paper: self.20260429181202.151	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429181202.151 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 18:13	Success	-	View
exp_self.20260429180426.150_20260429_180426 Paper: self.20260429180426.150	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429180426.150 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 18:05	Success	-	View
exp_self.20260429175657.149_20260429_175658 Paper: self.20260429175657.149	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429175657.149 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 17:58	Success	-	View
exp_pytrain.20260429175429.038_20260429_175429 Paper: pytrain.20260429175429.038	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 17:55	Success	-	View
exp_self.20260429174724.148_20260429_174725 Paper: self.20260429174724.148	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429174724.148 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 17:48	Success	-	View
exp_self.20260429173954.147_20260429_173954 Paper: self.20260429173954.147	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429173954.147 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 17:40	Success	-	View
exp_self.20260429173221.146_20260429_173221 Paper: self.20260429173221.146	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429173221.146 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 17:33	Success	-	View
exp_self.20260429172456.145_20260429_172456 Paper: self.20260429172456.145	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429172456.145 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 17:25	Success	-	View
exp_pytrain.20260429172232.037_20260429_172233 Paper: pytrain.20260429172232.037	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 17:23	Success	-	View
exp_self.20260429171814.144_20260429_171814 Paper: self.20260429171814.144	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429171814.144 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 17:19	Success	-	View
exp_self.20260429171047.143_20260429_171048 Paper: self.20260429171047.143	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429171047.143 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 17:11	Success	-	View
exp_self.20260429170314.142_20260429_170314 Paper: self.20260429170314.142	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429170314.142 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 17:04	Success	-	View
exp_self.20260429165340.141_20260429_165341 Paper: self.20260429165340.141	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429165340.141 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 16:54	Success	-	View
exp_pytrain.20260429165117.036_20260429_165118 Paper: pytrain.20260429165117.036	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 16:52	Success	-	View
exp_self.20260429164420.140_20260429_164420 Paper: self.20260429164420.140	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429164420.140 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 16:45	Success	-	View
exp_self.20260429163658.139_20260429_163658 Paper: self.20260429163658.139	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429163658.139 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 16:38	Success	-	View
exp_self.20260429162936.138_20260429_162936 Paper: self.20260429162936.138	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429162936.138 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 16:30	Success	-	View
exp_self.20260429162205.137_20260429_162205 Paper: self.20260429162205.137	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429162205.137 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 16:23	Success	-	View
exp_pytrain.20260429161941.035_20260429_161941 Paper: pytrain.20260429161941.035	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 16:20	Success	-	View
exp_self.20260429161425.136_20260429_161426 Paper: self.20260429161425.136	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429161425.136 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 16:15	Success	-	View
exp_self.20260429160658.135_20260429_160658 Paper: self.20260429160658.135	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429160658.135 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 16:08	Success	-	View
exp_self.20260429155936.134_20260429_155936 Paper: self.20260429155936.134	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429155936.134 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 16:00	Success	-	View
exp_self.20260429155213.133_20260429_155213 Paper: self.20260429155213.133	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429155213.133 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 15:53	Success	-	View
exp_pytrain.20260429154749.034_20260429_154749 Paper: pytrain.20260429154749.034	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 15:48	Success	-	View
exp_self.20260429154103.132_20260429_154103 Paper: self.20260429154103.132	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429154103.132 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 15:42	Success	-	View
exp_self.20260429153327.131_20260429_153327 Paper: self.20260429153327.131	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429153327.131 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 15:34	Success	-	View
exp_self.20260429152558.130_20260429_152558 Paper: self.20260429152558.130	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429152558.130 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 15:27	Success	-	View
exp_self.20260429151831.129_20260429_151832 Paper: self.20260429151831.129	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429151831.129 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 15:19	Success	-	View
exp_pytrain.20260429151605.033_20260429_151605 Paper: pytrain.20260429151605.033	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 15:17	Success	-	View
exp_self.20260429150907.128_20260429_150908 Paper: self.20260429150907.128	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429150907.128 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 15:10	Success	-	View
exp_self.20260429150145.127_20260429_150145 Paper: self.20260429150145.127	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429150145.127 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 15:02	Success	-	View
exp_self.20260429145411.126_20260429_145412 Paper: self.20260429145411.126	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429145411.126 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 14:55	Success	-	View
exp_self.20260429144637.125_20260429_144637 Paper: self.20260429144637.125	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429144637.125 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 14:47	Success	-	View
exp_pytrain.20260429144331.032_20260429_144331 Paper: pytrain.20260429144331.032	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 14:44	Success	-	View
exp_self.20260429143624.124_20260429_143624 Paper: self.20260429143624.124	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429143624.124 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 14:37	Success	-	View
exp_self.20260429142843.123_20260429_142844 Paper: self.20260429142843.123	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429142843.123 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 14:29	Success	-	View
exp_self.20260429142112.122_20260429_142112 Paper: self.20260429142112.122	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429142112.122 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 14:22	Success	-	View
exp_self.20260429141336.121_20260429_141337 Paper: self.20260429141336.121	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429141336.121 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 14:14	Success	-	View
exp_pytrain.20260429141106.031_20260429_141106 Paper: pytrain.20260429141106.031	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 14:12	Success	-	View
exp_self.20260429140403.120_20260429_140404 Paper: self.20260429140403.120	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429140403.120 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 14:05	Success	-	View
exp_self.20260429135634.119_20260429_135635 Paper: self.20260429135634.119	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429135634.119 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 13:57	Success	-	View
exp_self.20260429134908.118_20260429_134908 Paper: self.20260429134908.118	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429134908.118 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 13:50	Success	-	View
exp_self.20260429134131.117_20260429_134131 Paper: self.20260429134131.117	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429134131.117 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 13:42	Success	-	View
exp_pytrain.20260429133901.030_20260429_133902 Paper: pytrain.20260429133901.030	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 13:40	Success	-	View
exp_self.20260429133201.116_20260429_133201 Paper: self.20260429133201.116	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429133201.116 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 13:33	Success	-	View
exp_self.20260429132430.115_20260429_132430 Paper: self.20260429132430.115	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429132430.115 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 13:25	Success	-	View
exp_self.20260429131658.114_20260429_131658 Paper: self.20260429131658.114	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429131658.114 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 13:18	Success	-	View
exp_self.20260429130922.113_20260429_130922 Paper: self.20260429130922.113	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429130922.113 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 13:10	Success	-	View
exp_pytrain.20260429130651.029_20260429_130652 Paper: pytrain.20260429130651.029	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 13:07	Success	-	View
exp_self.20260429125949.112_20260429_125949 Paper: self.20260429125949.112	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429125949.112 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 13:00	Success	-	View
exp_self.20260429125221.111_20260429_125221 Paper: self.20260429125221.111	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429125221.111 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 12:53	Success	-	View
exp_self.20260429124450.110_20260429_124450 Paper: self.20260429124450.110	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429124450.110 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 12:45	Success	-	View
exp_self.20260429123721.109_20260429_123721 Paper: self.20260429123721.109	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429123721.109 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 12:38	Success	-	View
exp_pytrain.20260429123444.028_20260429_123444 Paper: pytrain.20260429123444.028	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 12:35	Success	-	View
exp_self.20260429122743.108_20260429_122744 Paper: self.20260429122743.108	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429122743.108 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 12:28	Success	-	View
exp_self.20260429122011.107_20260429_122012 Paper: self.20260429122011.107	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429122011.107 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 12:21	Success	-	View
exp_self.20260429121240.106_20260429_121240 Paper: self.20260429121240.106	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429121240.106 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 12:13	Success	-	View
exp_self.20260429120510.105_20260429_120511 Paper: self.20260429120510.105	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429120510.105 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 12:06	Success	-	View
exp_pytrain.20260429120236.027_20260429_120236 Paper: pytrain.20260429120236.027	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 12:03	Success	-	View
exp_self.20260429115535.104_20260429_115535 Paper: self.20260429115535.104	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429115535.104 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 11:56	Success	-	View
exp_self.20260429114801.103_20260429_114801 Paper: self.20260429114801.103	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429114801.103 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 11:49	Success	-	View
exp_self.20260429114031.102_20260429_114032 Paper: self.20260429114031.102	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429114031.102 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 11:41	Success	-	View
exp_self.20260429113259.101_20260429_113300 Paper: self.20260429113259.101	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429113259.101 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 11:34	Success	-	View
exp_pytrain.20260429113025.026_20260429_113026 Paper: pytrain.20260429113025.026	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 11:31	Success	-	View
exp_self.20260429112322.100_20260429_112322 Paper: self.20260429112322.100	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429112322.100 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 11:24	Success	-	View
exp_self.20260429111540.099_20260429_111540 Paper: self.20260429111540.099	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429111540.099 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 11:16	Success	-	View
exp_self.20260429110758.098_20260429_110758 Paper: self.20260429110758.098	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429110758.098 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 11:09	Success	-	View
exp_self.20260429110026.097_20260429_110026 Paper: self.20260429110026.097	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429110026.097 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 11:01	Success	-	View
exp_pytrain.20260429105751.025_20260429_105752 Paper: pytrain.20260429105751.025	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 10:58	Success	-	View
exp_self.20260429105213.096_20260429_105214 Paper: self.20260429105213.096	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429105213.096 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 10:53	Success	-	View
exp_self.20260429104423.095_20260429_104423 Paper: self.20260429104423.095	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429104423.095 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 10:45	Success	-	View
exp_self.20260429103635.094_20260429_103636 Paper: self.20260429103635.094	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429103635.094 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 10:37	Success	-	View
exp_self.20260429102857.093_20260429_102857 Paper: self.20260429102857.093	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429102857.093 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 10:29	Success	-	View
exp_pytrain.20260429102633.024_20260429_102633 Paper: pytrain.20260429102633.024	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 10:27	Success	-	View
exp_self.20260429101930.092_20260429_101931 Paper: self.20260429101930.092	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429101930.092 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 10:20	Success	-	View
exp_self.20260429101202.091_20260429_101203 Paper: self.20260429101202.091	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429101202.091 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 10:13	Success	-	View
exp_self.20260429100426.090_20260429_100426 Paper: self.20260429100426.090	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429100426.090 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 10:05	Success	-	View
exp_self.20260429095643.089_20260429_095643 Paper: self.20260429095643.089	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429095643.089 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 09:57	Success	-	View
exp_pytrain.20260429095419.023_20260429_095420 Paper: pytrain.20260429095419.023	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 09:55	Success	-	View
exp_self.20260429094953.088_20260429_094954 Paper: self.20260429094953.088	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429094953.088 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 09:50	Success	-	View
exp_self.20260429094225.087_20260429_094225 Paper: self.20260429094225.087	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429094225.087 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 09:43	Success	-	View
exp_self.20260429093500.086_20260429_093500 Paper: self.20260429093500.086	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429093500.086 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 09:36	Success	-	View
exp_self.20260429092730.085_20260429_092730 Paper: self.20260429092730.085	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429092730.085 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 09:28	Success	-	View
exp_pytrain.20260429092303.022_20260429_092303 Paper: pytrain.20260429092303.022	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 09:24	Success	-	View
exp_self.20260429091605.084_20260429_091605 Paper: self.20260429091605.084	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429091605.084 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 09:17	Success	-	View
exp_self.20260429090821.083_20260429_090822 Paper: self.20260429090821.083	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429090821.083 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 09:09	Success	-	View
exp_self.20260429090037.082_20260429_090038 Paper: self.20260429090037.082	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429090037.082 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 09:01	Success	-	View
exp_self.20260429085252.081_20260429_085252 Paper: self.20260429085252.081	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429085252.081 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 08:53	Success	-	View
exp_pytrain.20260429085020.021_20260429_085021 Paper: pytrain.20260429085020.021	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 08:51	Success	-	View
exp_self.20260429084424.080_20260429_084424 Paper: self.20260429084424.080	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429084424.080 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 08:45	Success	-	View
exp_self.20260429083640.079_20260429_083641 Paper: self.20260429083640.079	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429083640.079 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 08:37	Success	-	View
exp_self.20260429082900.078_20260429_082900 Paper: self.20260429082900.078	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429082900.078 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 08:30	Success	-	View
exp_self.20260429082108.077_20260429_082108 Paper: self.20260429082108.077	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429082108.077 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 08:22	Success	-	View
exp_pytrain.20260429081836.020_20260429_081837 Paper: pytrain.20260429081836.020	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 08:19	Success	-	View
exp_self.20260429081244.076_20260429_081244 Paper: self.20260429081244.076	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429081244.076 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 08:13	Success	-	View
exp_self.20260429080451.075_20260429_080451 Paper: self.20260429080451.075	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429080451.075 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 08:05	Success	-	View
exp_self.20260429075707.074_20260429_075708 Paper: self.20260429075707.074	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429075707.074 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 07:58	Success	-	View
exp_self.20260429074936.073_20260429_074936 Paper: self.20260429074936.073	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429074936.073 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 07:50	Success	-	View
exp_pytrain.20260429074707.019_20260429_074707 Paper: pytrain.20260429074707.019	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 07:48	Success	-	View
exp_self.20260429074003.072_20260429_074004 Paper: self.20260429074003.072	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429074003.072 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 07:41	Success	-	View
exp_self.20260429073228.071_20260429_073228 Paper: self.20260429073228.071	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429073228.071 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 07:33	Success	-	View
exp_self.20260429072449.070_20260429_072450 Paper: self.20260429072449.070	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429072449.070 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 07:25	Success	-	View
exp_self.20260429071658.069_20260429_071658 Paper: self.20260429071658.069	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429071658.069 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 07:18	Success	-	View
exp_pytrain.20260429071433.018_20260429_071433 Paper: pytrain.20260429071433.018	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 07:15	Success	-	View
exp_self.20260429070734.068_20260429_070734 Paper: self.20260429070734.068	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429070734.068 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 07:08	Success	-	View
exp_self.20260429070007.067_20260429_070008 Paper: self.20260429070007.067	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429070007.067 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 07:01	Success	-	View
exp_self.20260429065244.066_20260429_065244 Paper: self.20260429065244.066	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429065244.066 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 06:53	Success	-	View
exp_self.20260429064518.065_20260429_064519 Paper: self.20260429064518.065	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429064518.065 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 06:46	Success	-	View
exp_pytrain.20260429064248.017_20260429_064248 Paper: pytrain.20260429064248.017	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 06:43	Success	-	View
exp_self.20260429063534.064_20260429_063535 Paper: self.20260429063534.064	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429063534.064 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 06:36	Success	-	View
exp_self.20260429062803.063_20260429_062804 Paper: self.20260429062803.063	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429062803.063 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 06:29	Success	-	View
exp_self.20260429062037.062_20260429_062038 Paper: self.20260429062037.062	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429062037.062 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 06:21	Success	-	View
exp_self.20260429061307.061_20260429_061308 Paper: self.20260429061307.061	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429061307.061 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 06:14	Success	-	View
exp_pytrain.20260429061036.016_20260429_061037 Paper: pytrain.20260429061036.016	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 06:11	Success	-	View
exp_self.20260429060341.060_20260429_060341 Paper: self.20260429060341.060	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429060341.060 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 06:04	Success	-	View
exp_self.20260429055613.059_20260429_055613 Paper: self.20260429055613.059	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429055613.059 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 05:57	Success	-	View
exp_self.20260429054846.058_20260429_054846 Paper: self.20260429054846.058	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429054846.058 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 05:49	Success	-	View
exp_self.20260429054123.057_20260429_054123 Paper: self.20260429054123.057	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429054123.057 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 05:42	Success	-	View
exp_pytrain.20260429053853.015_20260429_053853 Paper: pytrain.20260429053853.015	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 05:39	Success	-	View
exp_self.20260429053206.056_20260429_053207 Paper: self.20260429053206.056	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429053206.056 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 05:33	Success	-	View
exp_self.20260429052436.055_20260429_052436 Paper: self.20260429052436.055	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429052436.055 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 05:25	Success	-	View
exp_self.20260429051705.054_20260429_051706 Paper: self.20260429051705.054	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429051705.054 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 05:18	Success	-	View
exp_self.20260429050937.053_20260429_050937 Paper: self.20260429050937.053	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429050937.053 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 05:10	Success	-	View
exp_pytrain.20260429050708.014_20260429_050709 Paper: pytrain.20260429050708.014	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 05:08	Success	-	View
exp_self.20260429050013.052_20260429_050013 Paper: self.20260429050013.052	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429050013.052 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 05:01	Success	-	View
exp_self.20260429045238.051_20260429_045238 Paper: self.20260429045238.051	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429045238.051 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 04:53	Success	-	View
exp_self.20260429044501.050_20260429_044501 Paper: self.20260429044501.050	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429044501.050 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 04:46	Success	-	View
exp_self.20260429043731.049_20260429_043732 Paper: self.20260429043731.049	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429043731.049 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 04:38	Success	-	View
exp_pytrain.20260429043508.013_20260429_043508 Paper: pytrain.20260429043508.013	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 04:36	Success	-	View
exp_self.20260429042806.048_20260429_042806 Paper: self.20260429042806.048	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429042806.048 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 04:29	Success	-	View
exp_self.20260429042039.047_20260429_042040 Paper: self.20260429042039.047	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429042039.047 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 04:21	Success	-	View
exp_self.20260429041305.046_20260429_041305 Paper: self.20260429041305.046	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429041305.046 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 04:14	Success	-	View
exp_self.20260429040535.045_20260429_040535 Paper: self.20260429040535.045	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429040535.045 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 04:06	Success	-	View
exp_pytrain.20260429040313.012_20260429_040313 Paper: pytrain.20260429040313.012	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 04:04	Success	-	View
exp_self.20260429035604.044_20260429_035604 Paper: self.20260429035604.044	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429035604.044 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 03:57	Success	-	View
exp_self.20260429034834.043_20260429_034834 Paper: self.20260429034834.043	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429034834.043 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 03:49	Success	-	View
exp_self.20260429034100.042_20260429_034101 Paper: self.20260429034100.042	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429034100.042 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 03:42	Success	-	View
exp_self.20260429033328.041_20260429_033329 Paper: self.20260429033328.041	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429033328.041 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 03:34	Success	-	View
exp_pytrain.20260429033106.011_20260429_033106 Paper: pytrain.20260429033106.011	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 03:32	Success	-	View
exp_self.20260429032403.040_20260429_032404 Paper: self.20260429032403.040	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429032403.040 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 03:25	Success	-	View
exp_self.20260429031632.039_20260429_031633 Paper: self.20260429031632.039	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429031632.039 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 03:17	Success	-	View
exp_self.20260429030859.038_20260429_030859 Paper: self.20260429030859.038	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429030859.038 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 03:10	Success	-	View
exp_self.20260429030114.037_20260429_030115 Paper: self.20260429030114.037	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429030114.037 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 03:02	Success	-	View
exp_pytrain.20260429025847.010_20260429_025847 Paper: pytrain.20260429025847.010	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 02:59	Success	-	View
exp_self.20260429025145.036_20260429_025145 Paper: self.20260429025145.036	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429025145.036 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 02:52	Success	-	View
exp_self.20260429024415.035_20260429_024415 Paper: self.20260429024415.035	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429024415.035 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 02:45	Success	-	View
exp_self.20260429023642.034_20260429_023642 Paper: self.20260429023642.034	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429023642.034 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 02:37	Success	-	View
exp_self.20260429022907.033_20260429_022907 Paper: self.20260429022907.033	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429022907.033 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 02:30	Success	-	View
exp_pytrain.20260429022639.009_20260429_022640 Paper: pytrain.20260429022639.009	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 02:27	Success	-	View
exp_hf_2604.25719_20260429_022357 Paper: hf_2604.25719	Step-Audio-R1.5 Technical Report Paper ID: hf_2604.25719 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-29 02:24	Success	-	View
exp_self.20260429021938.032_20260429_021938 Paper: self.20260429021938.032	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429021938.032 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 02:20	Success	-	View
exp_self.20260429021202.031_20260429_021203 Paper: self.20260429021202.031	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429021202.031 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 02:13	Success	-	View
exp_self.20260429020427.030_20260429_020427 Paper: self.20260429020427.030	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429020427.030 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 02:05	Success	-	View
exp_self.20260429015659.029_20260429_015659 Paper: self.20260429015659.029	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429015659.029 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 01:58	Success	-	View
exp_pytrain.20260429015436.008_20260429_015436 Paper: pytrain.20260429015436.008	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 01:55	Success	-	View
exp_self.20260429014732.028_20260429_014733 Paper: self.20260429014732.028	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429014732.028 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 01:48	Success	-	View
exp_self.20260429014007.027_20260429_014008 Paper: self.20260429014007.027	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429014007.027 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 01:41	Success	-	View
exp_self.20260429013231.026_20260429_013232 Paper: self.20260429013231.026	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429013231.026 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 01:33	Success	-	View
exp_self.20260429012500.025_20260429_012500 Paper: self.20260429012500.025	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429012500.025 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 01:26	Success	-	View
exp_pytrain.20260429012238.007_20260429_012238 Paper: pytrain.20260429012238.007	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 01:23	Success	-	View
exp_self.20260429011538.024_20260429_011539 Paper: self.20260429011538.024	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429011538.024 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 01:16	Success	-	View
exp_self.20260429010814.023_20260429_010814 Paper: self.20260429010814.023	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429010814.023 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 01:09	Success	-	View
exp_self.20260429010032.022_20260429_010033 Paper: self.20260429010032.022	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429010032.022 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 01:01	Success	-	View
exp_self.20260429005303.021_20260429_005304 Paper: self.20260429005303.021	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429005303.021 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 00:54	Success	-	View
exp_pytrain.20260429005041.006_20260429_005041 Paper: pytrain.20260429005041.006	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 00:51	Success	-	View
exp_self.20260429004625.020_20260429_004626 Paper: self.20260429004625.020	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429004625.020 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 00:47	Success	-	View
exp_self.20260429003900.019_20260429_003900 Paper: self.20260429003900.019	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429003900.019 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 00:40	Success	-	View
exp_gh_burgerkhan6227_tokenWise-Optimizer_20260429_003437 Paper: gh_burgerkhan6227_tokenWise-Optimizer	burgerkhan6227/tokenWise-Optimizer Paper ID: gh_burgerkhan6227_tokenWise-Optimizer - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Sign...	04-29 00:35	Success	-	View
exp_self.20260429003127.018_20260429_003127 Paper: self.20260429003127.018	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429003127.018 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 00:32	Success	-	View
exp_cr_10.30574_ijsra.2026.19.1.0697_20260429_002812 Paper: cr_10.30574_ijsra.2026.19.1.0697	Formation and efficiency analysis of an innovative business model in automotive engineering based on the principles of o... Paper ID: cr_10.30574_ijsra.2026.19.1.0697 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: R...	04-29 00:29	Success	-	View
exp_self.20260429002249.017_20260429_002249 Paper: self.20260429002249.017	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429002249.017 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 00:23	Success	-	View
exp_pytrain.20260429001912.005_20260429_001912 Paper: pytrain.20260429001912.005	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-29 00:20	Success	-	View
exp_self.20260429001458.016_20260429_001458 Paper: self.20260429001458.016	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429001458.016 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 00:16	Success	-	View
exp_cr_10.65196_7a1sxq95_20260429_001208 Paper: cr_10.65196_7a1sxq95	<b>量子机器学习在大模型训练加速中的应用探索</b><b></b> Paper ID: cr_10.65196_7a1sxq95 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered ben...	04-29 00:13	Success	-	View
exp_self.20260429000505.015_20260429_000506 Paper: self.20260429000505.015	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260429000505.015 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-29 00:06	Success	-	View
exp_cr_10.22214_ijraset.2026.79880_20260429_000155 Paper: cr_10.22214_ijraset.2026.79880	Design and Evaluation of a Smartphone Application for Early Atopic Dermatitis Screening Using Large Language Model Paper ID: cr_10.22214_ijraset.2026.79880 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Rec...	04-29 00:02	Success	-	View
exp_self.20260428235733.014_20260428_235733 Paper: self.20260428235733.014	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428235733.014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 23:58	Success	-	View
exp_self.20260428235009.013_20260428_235009 Paper: self.20260428235009.013	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428235009.013 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 23:51	Success	-	View
exp_pytrain.20260428234746.004_20260428_234746 Paper: pytrain.20260428234746.004	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 23:48	Success	-	View
exp_self.20260428234051.012_20260428_234052 Paper: self.20260428234051.012	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428234051.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 23:41	Success	-	View
exp_self.20260428233325.011_20260428_233326 Paper: self.20260428233325.011	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428233325.011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 23:34	Success	-	View
exp_self.20260428232600.010_20260428_232600 Paper: self.20260428232600.010	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428232600.010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 23:27	Success	-	View
exp_self.20260428231823.009_20260428_231824 Paper: self.20260428231823.009	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428231823.009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 23:19	Success	-	View
exp_pytrain.20260428231559.003_20260428_231600 Paper: pytrain.20260428231559.003	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 23:17	Success	-	View
exp_self.20260428230859.008_20260428_230859 Paper: self.20260428230859.008	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428230859.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 23:10	Success	-	View
exp_self.20260428230131.007_20260428_230132 Paper: self.20260428230131.007	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428230131.007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 23:02	Success	-	View
exp_self.20260428225412.006_20260428_225412 Paper: self.20260428225412.006	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428225412.006 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 22:55	Success	-	View
exp_self.20260428224643.005_20260428_224643 Paper: self.20260428224643.005	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428224643.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 22:47	Success	-	View
exp_pytrain.20260428224416.002_20260428_224416 Paper: pytrain.20260428224416.002	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 22:45	Success	-	View
exp_self.20260428223721.004_20260428_223721 Paper: self.20260428223721.004	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428223721.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 22:38	Success	-	View
exp_self.20260428222954.003_20260428_222954 Paper: self.20260428222954.003	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428222954.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 22:30	Success	-	View
exp_self.20260428222228.002_20260428_222229 Paper: self.20260428222228.002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428222228.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 22:23	Success	-	View
exp_hf_2604.23941_20260428_221910 Paper: hf_2604.23941	GoClick: Lightweight Element Grounding Model for Autonomous GUI Interaction Paper ID: hf_2604.23941 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 22:20	Success	-	View
exp_self.20260428221455.001_20260428_221455 Paper: self.20260428221455.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428221455.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 22:15	Success	-	View
exp_pytrain.20260428221232.001_20260428_221233 Paper: pytrain.20260428221232.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 22:13	Success	-	View
exp_self.20260428220844.040_20260428_220844 Paper: self.20260428220844.040	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428220844.040 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 22:09	Success	-	View
exp_pytrain.20260428220612.011_20260428_220612 Paper: pytrain.20260428220612.011	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 22:07	Success	-	View
exp_2604.25902v1_20260428_220326 Paper: 2604.25902v1	Toward a Functional Geometric Algebra for Natural Language Semantics Paper ID: 2604.25902v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-28 22:04	Success	-	View
exp_2604.25917v1_20260428_215820 Paper: 2604.25917v1	Recursive Multi-Agent Systems Paper ID: 2604.25917v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-28 21:59	Success	-	View
exp_self.20260428215609.039_20260428_215609 Paper: self.20260428215609.039	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428215609.039 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 21:57	Success	-	View
exp_2604.25903v1_20260428_215256 Paper: 2604.25903v1	Carbon-Taxed Transformers: A Green Compression Pipeline for Overgrown Language Models Paper ID: 2604.25903v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-28 21:53	Success	-	View
exp_self.20260428214720.038_20260428_214721 Paper: self.20260428214720.038	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428214720.038 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 21:48	Success	-	View
exp_self.20260428213945.037_20260428_213945 Paper: self.20260428213945.037	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428213945.037 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 21:40	Success	-	View
exp_hf_2604.25203_20260428_213651 Paper: hf_2604.25203	BARRED: Synthetic Training of Custom Policy Guardrails via Asymmetric Debate Paper ID: hf_2604.25203 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 21:37	Success	-	View
exp_pytrain.20260428213447.010_20260428_213448 Paper: pytrain.20260428213447.010	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 21:35	Success	-	View
exp_hf_2604.25819_20260428_212948 Paper: hf_2604.25819	Mutual Forcing: Dual-Mode Self-Evolution for Fast Autoregressive Audio-Video Character Generation Paper ID: hf_2604.25819 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 21:30	Success	-	View
exp_self.20260428212744.036_20260428_212745 Paper: self.20260428212744.036	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428212744.036 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 21:28	Success	-	View
exp_hf_2604.25427_20260428_212208 Paper: hf_2604.25427	A Systematic Post-Train Framework for Video Generation Paper ID: hf_2604.25427 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 21:23	Success	-	View
exp_self.20260428212004.035_20260428_212004 Paper: self.20260428212004.035	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428212004.035 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 21:21	Success	-	View
exp_self.20260428211233.034_20260428_211234 Paper: self.20260428211233.034	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428211233.034 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 21:13	Success	-	View
exp_self.20260428210458.033_20260428_210458 Paper: self.20260428210458.033	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428210458.033 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 21:06	Success	-	View
exp_pytrain.20260428210228.009_20260428_210228 Paper: pytrain.20260428210228.009	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 21:03	Success	-	View
exp_2604.25740v1_20260428_205727 Paper: 2604.25740v1	QAROO: AI-Driven Online Task Offloading for Energy-Efficient and Sustainable MEC Networks Paper ID: 2604.25740v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-28 20:58	Success	-	View
exp_self.20260428205522.032_20260428_205522 Paper: self.20260428205522.032	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428205522.032 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 20:56	Success	-	View
exp_2604.25774v1_20260428_205056 Paper: 2604.25774v1	CGU-ILALab at FoodBench-QA 2026: Comparing Traditional and LLM-based Approaches for Recipe Nutrient Estimation Paper ID: 2604.25774v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-28 20:51	Success	-	View
exp_self.20260428204740.031_20260428_204741 Paper: self.20260428204740.031	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428204740.031 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 20:48	Success	-	View
exp_hf_2604.25917_20260428_204421 Paper: hf_2604.25917	Recursive Multi-Agent Systems Paper ID: hf_2604.25917 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 20:45	Success	-	View
exp_self.20260428203959.030_20260428_204000 Paper: self.20260428203959.030	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428203959.030 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 20:41	Success	-	View
exp_self.20260428203229.029_20260428_203229 Paper: self.20260428203229.029	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428203229.029 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 20:33	Success	-	View
exp_pytrain.20260428202954.008_20260428_202954 Paper: pytrain.20260428202954.008	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 20:30	Success	-	View
exp_hf_2604.18756_20260428_202707 Paper: hf_2604.18756	Towards Understanding the Robustness of Sparse Autoencoders Paper ID: hf_2604.18756 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 20:28	Success	-	View
exp_self.20260428202141.028_20260428_202141 Paper: self.20260428202141.028	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428202141.028 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 20:22	Success	-	View
exp_self.20260428201403.027_20260428_201404 Paper: self.20260428201403.027	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428201403.027 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 20:15	Success	-	View
exp_self.20260428200633.026_20260428_200633 Paper: self.20260428200633.026	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428200633.026 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 20:07	Success	-	View
exp_self.20260428195907.025_20260428_195907 Paper: self.20260428195907.025	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428195907.025 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 20:00	Success	-	View
exp_pytrain.20260428195633.007_20260428_195633 Paper: pytrain.20260428195633.007	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 19:57	Success	-	View
exp_self.20260428194943.024_20260428_194943 Paper: self.20260428194943.024	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428194943.024 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 19:50	Success	-	View
exp_self.20260428194241.023_20260428_194242 Paper: self.20260428194241.023	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428194241.023 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 19:43	Success	-	View
exp_self.20260428193456.022_20260428_193457 Paper: self.20260428193456.022	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428193456.022 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 19:35	Success	-	View
exp_self.20260428192730.021_20260428_192730 Paper: self.20260428192730.021	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428192730.021 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 19:28	Success	-	View
exp_pytrain.20260428192459.006_20260428_192459 Paper: pytrain.20260428192459.006	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 19:26	Success	-	View
exp_self.20260428191815.020_20260428_191815 Paper: self.20260428191815.020	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428191815.020 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 19:19	Success	-	View
exp_self.20260428191047.019_20260428_191048 Paper: self.20260428191047.019	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428191047.019 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 19:11	Success	-	View
exp_self.20260428190326.018_20260428_190326 Paper: self.20260428190326.018	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428190326.018 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 19:04	Success	-	View
exp_self.20260428185559.017_20260428_185600 Paper: self.20260428185559.017	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428185559.017 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 18:57	Success	-	View
exp_pytrain.20260428185327.005_20260428_185328 Paper: pytrain.20260428185327.005	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 18:54	Success	-	View
exp_self.20260428184700.016_20260428_184700 Paper: self.20260428184700.016	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428184700.016 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 18:48	Success	-	View
exp_self.20260428184004.015_20260428_184004 Paper: self.20260428184004.015	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428184004.015 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 18:41	Success	-	View
exp_self.20260428183153.014_20260428_183154 Paper: self.20260428183153.014	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428183153.014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 18:32	Success	-	View
exp_self.20260428182343.013_20260428_182344 Paper: self.20260428182343.013	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428182343.013 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 18:24	Success	-	View
exp_pytrain.20260428182039.004_20260428_182039 Paper: pytrain.20260428182039.004	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 18:21	Success	-	View
exp_self.20260428181406.012_20260428_181406 Paper: self.20260428181406.012	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428181406.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 18:15	Success	-	View
exp_self.20260428180552.011_20260428_180553 Paper: self.20260428180552.011	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428180552.011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 18:06	Success	-	View
exp_self.20260428175850.010_20260428_175851 Paper: self.20260428175850.010	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428175850.010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 17:59	Success	-	View
exp_self.20260428175152.009_20260428_175153 Paper: self.20260428175152.009	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428175152.009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 17:52	Success	-	View
exp_pytrain.20260428174844.003_20260428_174845 Paper: pytrain.20260428174844.003	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 17:49	Success	-	View
exp_gh_Rangle2_mda_20260428_174430 Paper: gh_Rangle2_mda	Rangle2/mda Paper ID: gh_Rangle2_mda - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 17:45	Success	-	View
exp_self.20260428174142.008_20260428_174143 Paper: self.20260428174142.008	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428174142.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 17:42	Success	-	View
exp_self.20260428173330.007_20260428_173330 Paper: self.20260428173330.007	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428173330.007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 17:34	Success	-	View
exp_self.20260428172558.006_20260428_172559 Paper: self.20260428172558.006	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428172558.006 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 17:27	Success	-	View
exp_self.20260428171832.005_20260428_171832 Paper: self.20260428171832.005	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428171832.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 17:19	Success	-	View
exp_pytrain.20260428171608.002_20260428_171608 Paper: pytrain.20260428171608.002	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 17:17	Success	-	View
exp_self.20260428170918.004_20260428_170919 Paper: self.20260428170918.004	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428170918.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 17:10	Success	-	View
exp_self.20260428170158.003_20260428_170158 Paper: self.20260428170158.003	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428170158.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 17:03	Success	-	View
exp_self.20260428165437.002_20260428_165438 Paper: self.20260428165437.002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428165437.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 16:55	Success	-	View
exp_hf_2604.15574_20260428_165123 Paper: hf_2604.15574	Why Fine-Tuning Encourages Hallucinations and How to Fix It Paper ID: hf_2604.15574 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 16:52	Success	-	View
exp_self.20260428164709.001_20260428_164709 Paper: self.20260428164709.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428164709.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 16:48	Success	-	View
exp_pytrain.20260428164446.001_20260428_164446 Paper: pytrain.20260428164446.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 16:45	Success	-	View
exp_self.20260428162713.043_20260428_162714 Paper: self.20260428162713.043	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428162713.043 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 16:38	Pending	-	View
exp_pytrain.20260428162422.016_20260428_162422 Paper: pytrain.20260428162422.016	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 16:25	Success	-	View
exp_hf_2604.24040_20260428_160049 Paper: hf_2604.24040	Improving Robustness of Tabular Retrieval via Representational Stability Paper ID: hf_2604.24040 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 16:22	Failed	NameError: name 'D_MODEL' is not defined	View
exp_self.20260428153717.042_20260428_153717 Paper: self.20260428153717.042	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428153717.042 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 15:59	Failed	NameError: name 'D_MODEL' is not defined	View
exp_pytrain.20260428153326.015_20260428_153327 Paper: pytrain.20260428153326.015	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 15:35	Success	-	View
exp_hf_2604.21681_20260428_150852 Paper: hf_2604.21681	Sapiens2 Paper ID: hf_2604.21681 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 15:31	Failed	NameError: name 'D_MODEL' is not defined	View
exp_self.20260428144447.041_20260428_144447 Paper: self.20260428144447.041	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428144447.041 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 15:06	Failed	NameError: name 'D_MODEL' is not defined	View
exp_pytrain.20260428144204.014_20260428_144205 Paper: pytrain.20260428144204.014	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 14:43	Success	-	View
exp_self.20260428141318.040_20260428_141319 Paper: self.20260428141318.040	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428141318.040 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 14:36	Failed	NameError: name 'D_MODEL' is not defined	View
exp_pytrain.20260428140827.013_20260428_140827 Paper: pytrain.20260428140827.013	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 14:11	Success	-	View
exp_self.20260428134113.039_20260428_134113 Paper: self.20260428134113.039	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428134113.039 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 14:03	Failed	NameError: name 'D_MODEL' is not defined	View
exp_pytrain.20260428133611.012_20260428_133611 Paper: pytrain.20260428133611.012	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 13:37	Success	-	View
exp_self.20260428131317.038_20260428_131317 Paper: self.20260428131317.038	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428131317.038 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 13:35	Failed	NameError: name 'D_MODEL' is not defined	View
exp_self.20260428124536.037_20260428_124536 Paper: self.20260428124536.037	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428124536.037 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 13:07	Failed	NameError: name 'D_MODEL' is not defined	View
exp_pytrain.20260428124255.011_20260428_124255 Paper: pytrain.20260428124255.011	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 12:44	Success	-	View
exp_self.20260428121459.036_20260428_121459 Paper: self.20260428121459.036	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428121459.036 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 12:36	Failed	NameError: name 'D_MODEL' is not defined	View
exp_pytrain.20260428115034.010_20260428_115034 Paper: pytrain.20260428115034.010	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 12:11	Failed	Timeout while waiting for process shutdown	View
exp_self.20260428114309.035_20260428_114310 Paper: self.20260428114309.035	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428114309.035 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 11:44	Success	-	View
exp_self.20260428113542.034_20260428_113542 Paper: self.20260428113542.034	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428113542.034 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 11:36	Success	-	View
exp_self.20260428112759.033_20260428_112759 Paper: self.20260428112759.033	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428112759.033 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 11:29	Success	-	View
exp_self.20260428112052.032_20260428_112053 Paper: self.20260428112052.032	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428112052.032 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 11:22	Success	-	View
exp_pytrain.20260428111753.009_20260428_111753 Paper: pytrain.20260428111753.009	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 11:18	Success	-	View
exp_self.20260428111112.031_20260428_111112 Paper: self.20260428111112.031	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428111112.031 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 11:12	Success	-	View
exp_self.20260428110353.030_20260428_110353 Paper: self.20260428110353.030	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428110353.030 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 11:05	Success	-	View
exp_self.20260428105632.029_20260428_105633 Paper: self.20260428105632.029	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428105632.029 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 10:57	Success	-	View
exp_self.20260428104908.028_20260428_104908 Paper: self.20260428104908.028	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428104908.028 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 10:50	Success	-	View
exp_pytrain.20260428104549.008_20260428_104549 Paper: pytrain.20260428104549.008	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 10:46	Success	-	View
exp_self.20260428103915.027_20260428_103915 Paper: self.20260428103915.027	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428103915.027 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 10:40	Success	-	View
exp_self.20260428103147.026_20260428_103147 Paper: self.20260428103147.026	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428103147.026 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 10:32	Success	-	View
exp_self.20260428102407.025_20260428_102407 Paper: self.20260428102407.025	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428102407.025 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 10:25	Success	-	View
exp_self.20260428101639.024_20260428_101639 Paper: self.20260428101639.024	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428101639.024 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 10:17	Success	-	View
exp_pytrain.20260428101327.007_20260428_101328 Paper: pytrain.20260428101327.007	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 10:14	Success	-	View
exp_self.20260428100834.023_20260428_100834 Paper: self.20260428100834.023	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428100834.023 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 10:09	Success	-	View
exp_self.20260428100108.022_20260428_100109 Paper: self.20260428100108.022	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428100108.022 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 10:02	Success	-	View
exp_self.20260428095339.021_20260428_095339 Paper: self.20260428095339.021	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428095339.021 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 09:54	Success	-	View
exp_self.20260428094609.020_20260428_094610 Paper: self.20260428094609.020	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428094609.020 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 09:47	Success	-	View
exp_pytrain.20260428094056.006_20260428_094056 Paper: pytrain.20260428094056.006	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 09:42	Success	-	View
exp_self.20260428093607.019_20260428_093608 Paper: self.20260428093607.019	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428093607.019 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 09:38	Success	-	View
exp_self.20260428092517.018_20260428_092518 Paper: self.20260428092517.018	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428092517.018 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 09:28	Success	-	View
exp_self.20260428091435.017_20260428_091435 Paper: self.20260428091435.017	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428091435.017 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 09:17	Success	-	View
exp_pytrain.20260428090851.005_20260428_090851 Paper: pytrain.20260428090851.005	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 09:10	Success	-	View
exp_self.20260428090628.016_20260428_090628 Paper: self.20260428090628.016	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428090628.016 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 09:07	Success	-	View
exp_self.20260428085820.015_20260428_085821 Paper: self.20260428085820.015	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428085820.015 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 08:59	Success	-	View
exp_self.20260428085012.014_20260428_085012 Paper: self.20260428085012.014	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428085012.014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 08:51	Success	-	View
exp_self.20260428084220.013_20260428_084221 Paper: self.20260428084220.013	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428084220.013 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 08:43	Success	-	View
exp_hf_2604.23644_20260428_083911 Paper: hf_2604.23644	RaV-IDP: A Reconstruction-as-Validation Framework for Faithful Intelligent Document Processing Paper ID: hf_2604.23644 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 08:40	Success	-	View
exp_pytrain.20260428083651.004_20260428_083652 Paper: pytrain.20260428083651.004	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 08:37	Success	-	View
exp_self.20260428083015.012_20260428_083015 Paper: self.20260428083015.012	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428083015.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 08:31	Success	-	View
exp_hf_2604.17565_20260428_082520 Paper: hf_2604.17565	UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models Paper ID: hf_2604.17565 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 08:26	Success	-	View
exp_self.20260428082259.011_20260428_082300 Paper: self.20260428082259.011	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428082259.011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 08:24	Success	-	View
exp_self.20260428081528.010_20260428_081529 Paper: self.20260428081528.010	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428081528.010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 08:16	Success	-	View
exp_self.20260428080802.009_20260428_080802 Paper: self.20260428080802.009	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428080802.009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 08:09	Success	-	View
exp_pytrain.20260428080505.003_20260428_080505 Paper: pytrain.20260428080505.003	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 08:06	Success	-	View
exp_self.20260428080054.008_20260428_080055 Paper: self.20260428080054.008	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428080054.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 08:01	Success	-	View
exp_self.20260428075312.007_20260428_075312 Paper: self.20260428075312.007	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428075312.007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 07:54	Success	-	View
exp_self.20260428074536.006_20260428_074536 Paper: self.20260428074536.006	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428074536.006 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 07:46	Success	-	View
exp_self.20260428073753.005_20260428_073754 Paper: self.20260428073753.005	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428073753.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 07:39	Success	-	View
exp_pytrain.20260428073321.002_20260428_073321 Paper: pytrain.20260428073321.002	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 07:34	Success	-	View
exp_self.20260428073051.004_20260428_073051 Paper: self.20260428073051.004	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428073051.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 07:31	Success	-	View
exp_hf_2604.22842_20260428_072732 Paper: hf_2604.22842	EX-FIQA: Leveraging Intermediate Early eXit Representations from Vision Transformers for Face Image Quality Assessment Paper ID: hf_2604.22842 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 07:28	Success	-	View
exp_self.20260428072013.003_20260428_072013 Paper: self.20260428072013.003	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428072013.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 07:21	Success	-	View
exp_self.20260428071235.002_20260428_071235 Paper: self.20260428071235.002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428071235.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 07:13	Success	-	View
exp_self.20260428070411.001_20260428_070412 Paper: self.20260428070411.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428070411.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 07:05	Success	-	View
exp_pytrain.20260428070115.001_20260428_070116 Paper: pytrain.20260428070115.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 07:02	Success	-	View
exp_self.20260428035110.271_20260428_035111 Paper: self.20260428035110.271	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428035110.271 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 03:51	Pending	-	View
exp_self.20260428034338.270_20260428_034338 Paper: self.20260428034338.270	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428034338.270 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 03:44	Success	-	View
exp_pytrain.20260428034034.066_20260428_034034 Paper: pytrain.20260428034034.066	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 03:41	Success	-	View
exp_self.20260428033356.269_20260428_033356 Paper: self.20260428033356.269	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428033356.269 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 03:35	Success	-	View
exp_self.20260428032633.268_20260428_032633 Paper: self.20260428032633.268	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428032633.268 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 03:27	Success	-	View
exp_self.20260428031904.267_20260428_031904 Paper: self.20260428031904.267	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428031904.267 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 03:20	Success	-	View
exp_self.20260428031129.266_20260428_031129 Paper: self.20260428031129.266	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428031129.266 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 03:12	Success	-	View
exp_pytrain.20260428030825.065_20260428_030825 Paper: pytrain.20260428030825.065	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 03:09	Success	-	View
exp_self.20260428030126.265_20260428_030126 Paper: self.20260428030126.265	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428030126.265 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 03:02	Success	-	View
exp_hf_2604.22841_20260428_025625 Paper: hf_2604.22841	ATTN-FIQA: Interpretable Attention-based Face Image Quality Assessment with Vision Transformers Paper ID: hf_2604.22841 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 02:57	Success	-	View
exp_self.20260428025358.264_20260428_025358 Paper: self.20260428025358.264	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428025358.264 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 02:55	Success	-	View
exp_hf_2604.23210_20260428_024857 Paper: hf_2604.23210	Discovering Agentic Safety Specifications from 1-Bit Danger Signals Paper ID: hf_2604.23210 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 02:50	Success	-	View
exp_self.20260428024612.263_20260428_024612 Paper: self.20260428024612.263	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428024612.263 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 02:47	Success	-	View
exp_self.20260428023851.262_20260428_023852 Paper: self.20260428023851.262	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428023851.262 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 02:40	Success	-	View
exp_pytrain.20260428023535.064_20260428_023535 Paper: pytrain.20260428023535.064	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 02:36	Success	-	View
exp_self.20260428023133.261_20260428_023133 Paper: self.20260428023133.261	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428023133.261 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 02:32	Success	-	View
exp_self.20260428022418.260_20260428_022418 Paper: self.20260428022418.260	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428022418.260 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 02:25	Success	-	View
exp_self.20260428021640.259_20260428_021640 Paper: self.20260428021640.259	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428021640.259 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 02:17	Success	-	View
exp_self.20260428020922.258_20260428_020922 Paper: self.20260428020922.258	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428020922.258 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 02:10	Success	-	View
exp_pytrain.20260428020352.063_20260428_020353 Paper: pytrain.20260428020352.063	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 02:05	Success	-	View
exp_self.20260428020128.257_20260428_020129 Paper: self.20260428020128.257	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428020128.257 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 02:02	Success	-	View
exp_self.20260428015354.256_20260428_015355 Paper: self.20260428015354.256	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428015354.256 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 01:55	Success	-	View
exp_self.20260428014625.255_20260428_014625 Paper: self.20260428014625.255	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428014625.255 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 01:47	Success	-	View
exp_cr_10.3389_fchem.2026.1834317_20260428_014316 Paper: cr_10.3389_fchem.2026.1834317	CS-DTA: a language model-driven framework for robust drug-target affinity prediction under strict cold-start scenarios Paper ID: cr_10.3389_fchem.2026.1834317 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Reco...	04-28 01:44	Success	-	View
exp_hf_2508.10180_20260428_013945 Paper: hf_2508.10180	For-Value: Efficient Forward-Only Data Valuation for finetuning LLMs and VLMs Paper ID: hf_2508.10180 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 01:40	Success	-	View
exp_self.20260428013438.254_20260428_013438 Paper: self.20260428013438.254	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428013438.254 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 01:35	Success	-	View
exp_pytrain.20260428013147.062_20260428_013148 Paper: pytrain.20260428013147.062	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 01:32	Success	-	View
exp_self.20260428012549.253_20260428_012549 Paper: self.20260428012549.253	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428012549.253 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 01:26	Success	-	View
exp_self.20260428011823.252_20260428_011823 Paper: self.20260428011823.252	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428011823.252 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 01:19	Success	-	View
exp_self.20260428011105.251_20260428_011106 Paper: self.20260428011105.251	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428011105.251 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 01:12	Success	-	View
exp_self.20260428010321.250_20260428_010321 Paper: self.20260428010321.250	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428010321.250 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 01:04	Success	-	View
exp_pytrain.20260428010017.061_20260428_010017 Paper: pytrain.20260428010017.061	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 01:01	Success	-	View
exp_self.20260428005332.249_20260428_005332 Paper: self.20260428005332.249	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428005332.249 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 00:54	Success	-	View
exp_self.20260428004606.248_20260428_004607 Paper: self.20260428004606.248	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428004606.248 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 00:47	Success	-	View
exp_self.20260428003851.247_20260428_003851 Paper: self.20260428003851.247	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428003851.247 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 00:39	Success	-	View
exp_self.20260428003112.246_20260428_003112 Paper: self.20260428003112.246	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428003112.246 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 00:32	Success	-	View
exp_pytrain.20260428002802.060_20260428_002803 Paper: pytrain.20260428002802.060	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-28 00:29	Success	-	View
exp_hf_2604.21480_20260428_002454 Paper: hf_2604.21480	Efficient Agent Evaluation via Diversity-Guided User Simulation Paper ID: hf_2604.21480 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-28 00:26	Success	-	View
exp_self.20260428002113.245_20260428_002114 Paper: self.20260428002113.245	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428002113.245 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 00:22	Success	-	View
exp_self.20260428001403.244_20260428_001404 Paper: self.20260428001403.244	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428001403.244 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 00:15	Success	-	View
exp_self.20260428000617.243_20260428_000618 Paper: self.20260428000617.243	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260428000617.243 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-28 00:07	Success	-	View
exp_self.20260427235833.242_20260427_235834 Paper: self.20260427235833.242	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427235833.242 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 23:59	Success	-	View
exp_pytrain.20260427235539.059_20260427_235539 Paper: pytrain.20260427235539.059	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 23:56	Success	-	View
exp_self.20260427234852.241_20260427_234853 Paper: self.20260427234852.241	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427234852.241 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 23:49	Success	-	View
exp_self.20260427234117.240_20260427_234118 Paper: self.20260427234117.240	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427234117.240 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 23:42	Success	-	View
exp_hf_2604.23775_20260427_233739 Paper: hf_2604.23775	Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms Paper ID: hf_2604.23775 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-27 23:38	Success	-	View
exp_self.20260427233359.239_20260427_233359 Paper: self.20260427233359.239	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427233359.239 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 23:35	Success	-	View
exp_self.20260427232656.238_20260427_232656 Paper: self.20260427232656.238	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427232656.238 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 23:27	Success	-	View
exp_pytrain.20260427232358.058_20260427_232359 Paper: pytrain.20260427232358.058	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 23:25	Success	-	View
exp_self.20260427231909.237_20260427_231910 Paper: self.20260427231909.237	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427231909.237 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 23:20	Success	-	View
exp_hf_2604.24300_20260427_231404 Paper: hf_2604.24300	ReVSI: Rebuilding Visual Spatial Intelligence Evaluation for Accurate Assessment of VLM 3D Reasoning Paper ID: hf_2604.24300 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-27 23:15	Success	-	View
exp_self.20260427231151.236_20260427_231151 Paper: self.20260427231151.236	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427231151.236 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 23:12	Success	-	View
exp_self.20260427230423.235_20260427_230423 Paper: self.20260427230423.235	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427230423.235 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 23:05	Success	-	View
exp_self.20260427225714.234_20260427_225714 Paper: self.20260427225714.234	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427225714.234 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 22:58	Success	-	View
exp_pytrain.20260427225153.057_20260427_225154 Paper: pytrain.20260427225153.057	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 22:52	Success	-	View
exp_self.20260427224938.233_20260427_224938 Paper: self.20260427224938.233	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427224938.233 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 22:50	Success	-	View
exp_hf_2604.23099_20260427_224605 Paper: hf_2604.23099	ProEval: Proactive Failure Discovery and Efficient Performance Estimation for Generative AI Evaluation Paper ID: hf_2604.23099 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-27 22:47	Success	-	View
exp_self.20260427223933.232_20260427_223934 Paper: self.20260427223933.232	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427223933.232 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 22:40	Success	-	View
exp_self.20260427223222.231_20260427_223223 Paper: self.20260427223222.231	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427223222.231 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 22:33	Success	-	View
exp_hf_2604.24003_20260427_222907 Paper: hf_2604.24003	Stabilizing Efficient Reasoning with Step-Level Advantage Selection Paper ID: hf_2604.24003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-27 22:30	Success	-	View
exp_self.20260427222236.230_20260427_222236 Paper: self.20260427222236.230	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427222236.230 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 22:23	Success	-	View
exp_pytrain.20260427221947.056_20260427_221948 Paper: pytrain.20260427221947.056	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 22:20	Success	-	View
exp_self.20260427221458.229_20260427_221459 Paper: self.20260427221458.229	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427221458.229 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 22:16	Success	-	View
exp_self.20260427220750.228_20260427_220750 Paper: self.20260427220750.228	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427220750.228 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 22:08	Success	-	View
exp_2604.24645v1_20260427_220414 Paper: 2604.24645v1	K-MetBench: A Multi-Dimensional Benchmark for Fine-Grained Evaluation of Expert Reasoning, Locality, and Multimodality i... Paper ID: 2604.24645v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-27 22:05	Success	-	View
exp_self.20260427220039.227_20260427_220039 Paper: self.20260427220039.227	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427220039.227 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 22:01	Success	-	View
exp_self.20260427215320.226_20260427_215320 Paper: self.20260427215320.226	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427215320.226 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 21:54	Success	-	View
exp_2604.24647v1_20260427_215021 Paper: 2604.24647v1	DepthKV: Layer-Dependent KV Cache Pruning for Long-Context LLM Inference Paper ID: 2604.24647v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-27 21:51	Success	-	View
exp_pytrain.20260427214806.055_20260427_214806 Paper: pytrain.20260427214806.055	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 21:49	Success	-	View
exp_self.20260427214558.225_20260427_214558 Paper: self.20260427214558.225	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427214558.225 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 21:47	Success	-	View
exp_self.20260427213912.224_20260427_213913 Paper: self.20260427213912.224	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427213912.224 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 21:40	Success	-	View
exp_self.20260427213206.223_20260427_213206 Paper: self.20260427213206.223	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427213206.223 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 21:33	Success	-	View
exp_self.20260427212456.222_20260427_212457 Paper: self.20260427212456.222	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427212456.222 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 21:25	Success	-	View
exp_self.20260427211731.221_20260427_211732 Paper: self.20260427211731.221	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427211731.221 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 21:18	Success	-	View
exp_pytrain.20260427211438.054_20260427_211438 Paper: pytrain.20260427211438.054	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 21:15	Success	-	View
exp_self.20260427210818.220_20260427_210818 Paper: self.20260427210818.220	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427210818.220 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 21:09	Success	-	View
exp_self.20260427210114.219_20260427_210114 Paper: self.20260427210114.219	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427210114.219 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 21:02	Success	-	View
exp_self.20260427205406.218_20260427_205407 Paper: self.20260427205406.218	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427205406.218 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 20:55	Success	-	View
exp_self.20260427204706.217_20260427_204706 Paper: self.20260427204706.217	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427204706.217 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 20:48	Success	-	View
exp_pytrain.20260427204257.053_20260427_204257 Paper: pytrain.20260427204257.053	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 20:43	Success	-	View
exp_self.20260427203943.216_20260427_203944 Paper: self.20260427203943.216	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427203943.216 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 20:40	Success	-	View
exp_self.20260427203224.215_20260427_203224 Paper: self.20260427203224.215	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427203224.215 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 20:33	Success	-	View
exp_self.20260427202501.214_20260427_202502 Paper: self.20260427202501.214	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427202501.214 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 20:26	Success	-	View
exp_self.20260427201618.213_20260427_201619 Paper: self.20260427201618.213	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427201618.213 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 20:17	Success	-	View
exp_pytrain.20260427201112.052_20260427_201113 Paper: pytrain.20260427201112.052	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 20:12	Success	-	View
exp_self.20260427200902.212_20260427_200903 Paper: self.20260427200902.212	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427200902.212 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 20:10	Success	-	View
exp_self.20260427200144.211_20260427_200145 Paper: self.20260427200144.211	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427200144.211 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 20:02	Success	-	View
exp_self.20260427195445.210_20260427_195446 Paper: self.20260427195445.210	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427195445.210 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 19:55	Success	-	View
exp_self.20260427194746.209_20260427_194746 Paper: self.20260427194746.209	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427194746.209 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 19:48	Success	-	View
exp_gh_Keeterete513_llm-model-search-recommendation_20260427_194300 Paper: gh_Keeterete513_llm-model-search-recommendation	Keeterete513/llm-model-search-recommendation Paper ID: gh_Keeterete513_llm-model-search-recommendation - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Exp...	04-27 19:44	Success	-	View
exp_self.20260427194043.208_20260427_194044 Paper: self.20260427194043.208	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427194043.208 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 19:41	Success	-	View
exp_pytrain.20260427193752.051_20260427_193752 Paper: pytrain.20260427193752.051	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 19:38	Success	-	View
exp_self.20260427193312.207_20260427_193313 Paper: self.20260427193312.207	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427193312.207 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 19:34	Success	-	View
exp_self.20260427192545.206_20260427_192546 Paper: self.20260427192545.206	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427192545.206 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 19:26	Success	-	View
exp_self.20260427191853.205_20260427_191853 Paper: self.20260427191853.205	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427191853.205 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 19:19	Success	-	View
exp_self.20260427191138.204_20260427_191139 Paper: self.20260427191138.204	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427191138.204 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 19:12	Success	-	View
exp_pytrain.20260427190618.050_20260427_190618 Paper: pytrain.20260427190618.050	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 19:07	Success	-	View
exp_self.20260427190406.203_20260427_190406 Paper: self.20260427190406.203	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427190406.203 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 19:05	Success	-	View
exp_self.20260427185707.202_20260427_185707 Paper: self.20260427185707.202	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427185707.202 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 18:58	Success	-	View
exp_self.20260427185002.201_20260427_185003 Paper: self.20260427185002.201	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427185002.201 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 18:51	Success	-	View
exp_self.20260427184254.200_20260427_184255 Paper: self.20260427184254.200	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427184254.200 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 18:43	Success	-	View
exp_self.20260427183558.199_20260427_183559 Paper: self.20260427183558.199	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427183558.199 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 18:37	Success	-	View
exp_pytrain.20260427183321.049_20260427_183321 Paper: pytrain.20260427183321.049	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 18:34	Success	-	View
exp_self.20260427182538.198_20260427_182539 Paper: self.20260427182538.198	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427182538.198 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 18:26	Success	-	View
exp_self.20260427181835.197_20260427_181835 Paper: self.20260427181835.197	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427181835.197 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 18:19	Success	-	View
exp_self.20260427181120.196_20260427_181121 Paper: self.20260427181120.196	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427181120.196 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 18:12	Success	-	View
exp_self.20260427180424.195_20260427_180424 Paper: self.20260427180424.195	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427180424.195 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 18:05	Success	-	View
exp_pytrain.20260427180148.048_20260427_180148 Paper: pytrain.20260427180148.048	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 18:02	Success	-	View
exp_self.20260427175525.194_20260427_175525 Paper: self.20260427175525.194	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427175525.194 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 17:56	Success	-	View
exp_self.20260427174723.193_20260427_174724 Paper: self.20260427174723.193	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427174723.193 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 17:48	Success	-	View
exp_self.20260427174020.192_20260427_174020 Paper: self.20260427174020.192	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427174020.192 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 17:41	Success	-	View
exp_self.20260427173302.191_20260427_173303 Paper: self.20260427173302.191	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427173302.191 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 17:34	Success	-	View
exp_pytrain.20260427173011.047_20260427_173012 Paper: pytrain.20260427173011.047	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 17:31	Success	-	View
exp_self.20260427172355.190_20260427_172355 Paper: self.20260427172355.190	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427172355.190 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 17:24	Success	-	View
exp_self.20260427171608.189_20260427_171608 Paper: self.20260427171608.189	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427171608.189 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 17:17	Success	-	View
exp_self.20260427170812.188_20260427_170812 Paper: self.20260427170812.188	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427170812.188 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 17:09	Success	-	View
exp_self.20260427165949.187_20260427_165950 Paper: self.20260427165949.187	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427165949.187 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 17:00	Success	-	View
exp_pytrain.20260427165721.046_20260427_165721 Paper: pytrain.20260427165721.046	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 16:58	Success	-	View
exp_self.20260427165033.186_20260427_165034 Paper: self.20260427165033.186	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427165033.186 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 16:51	Success	-	View
exp_self.20260427164333.185_20260427_164334 Paper: self.20260427164333.185	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427164333.185 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 16:44	Success	-	View
exp_self.20260427163647.184_20260427_163648 Paper: self.20260427163647.184	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427163647.184 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 16:37	Success	-	View
exp_self.20260427162943.183_20260427_162944 Paper: self.20260427162943.183	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427162943.183 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 16:30	Success	-	View
exp_pytrain.20260427162433.045_20260427_162434 Paper: pytrain.20260427162433.045	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 16:25	Success	-	View
exp_self.20260427162232.182_20260427_162233 Paper: self.20260427162232.182	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427162232.182 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 16:23	Success	-	View
exp_self.20260427161518.181_20260427_161518 Paper: self.20260427161518.181	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427161518.181 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 16:16	Success	-	View
exp_self.20260427160833.180_20260427_160834 Paper: self.20260427160833.180	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427160833.180 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 16:09	Success	-	View
exp_self.20260427160146.179_20260427_160147 Paper: self.20260427160146.179	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427160146.179 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 16:02	Success	-	View
exp_self.20260427155456.178_20260427_155456 Paper: self.20260427155456.178	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427155456.178 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 15:55	Success	-	View
exp_pytrain.20260427155204.044_20260427_155204 Paper: pytrain.20260427155204.044	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 15:53	Success	-	View
exp_self.20260427154548.177_20260427_154549 Paper: self.20260427154548.177	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427154548.177 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 15:46	Success	-	View
exp_self.20260427153845.176_20260427_153845 Paper: self.20260427153845.176	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427153845.176 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 15:39	Success	-	View
exp_self.20260427153144.175_20260427_153145 Paper: self.20260427153144.175	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427153144.175 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 15:32	Success	-	View
exp_self.20260427152450.174_20260427_152451 Paper: self.20260427152450.174	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427152450.174 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 15:25	Success	-	View
exp_pytrain.20260427151938.043_20260427_151939 Paper: pytrain.20260427151938.043	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 15:20	Success	-	View
exp_self.20260427151737.173_20260427_151737 Paper: self.20260427151737.173	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427151737.173 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 15:18	Success	-	View
exp_self.20260427151039.172_20260427_151039 Paper: self.20260427151039.172	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427151039.172 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 15:11	Success	-	View
exp_self.20260427150354.171_20260427_150354 Paper: self.20260427150354.171	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427150354.171 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 15:04	Success	-	View
exp_self.20260427145704.170_20260427_145705 Paper: self.20260427145704.170	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427145704.170 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 14:58	Success	-	View
exp_self.20260427145002.169_20260427_145002 Paper: self.20260427145002.169	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427145002.169 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 14:51	Success	-	View
exp_pytrain.20260427144717.042_20260427_144717 Paper: pytrain.20260427144717.042	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 14:48	Success	-	View
exp_self.20260427144241.168_20260427_144242 Paper: self.20260427144241.168	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427144241.168 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 14:43	Success	-	View
exp_self.20260427143525.167_20260427_143525 Paper: self.20260427143525.167	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427143525.167 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 14:36	Success	-	View
exp_self.20260427142825.166_20260427_142825 Paper: self.20260427142825.166	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427142825.166 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 14:29	Success	-	View
exp_self.20260427142118.165_20260427_142119 Paper: self.20260427142118.165	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427142118.165 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 14:22	Success	-	View
exp_pytrain.20260427141543.041_20260427_141543 Paper: pytrain.20260427141543.041	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 14:16	Success	-	View
exp_self.20260427141333.164_20260427_141334 Paper: self.20260427141333.164	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427141333.164 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 14:14	Success	-	View
exp_self.20260427140622.163_20260427_140622 Paper: self.20260427140622.163	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427140622.163 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 14:07	Success	-	View
exp_self.20260427135918.162_20260427_135918 Paper: self.20260427135918.162	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427135918.162 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 14:00	Success	-	View
exp_self.20260427135206.161_20260427_135217 Paper: self.20260427135206.161	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427135206.161 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 13:53	Success	-	View
exp_self.20260427134518.160_20260427_134519 Paper: self.20260427134518.160	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427134518.160 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 13:46	Success	-	View
exp_pytrain.20260427134233.040_20260427_134233 Paper: pytrain.20260427134233.040	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 13:43	Success	-	View
exp_self.20260427133627.159_20260427_133627 Paper: self.20260427133627.159	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427133627.159 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 13:37	Success	-	View
exp_self.20260427132926.158_20260427_132926 Paper: self.20260427132926.158	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427132926.158 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 13:30	Success	-	View
exp_self.20260427132234.157_20260427_132234 Paper: self.20260427132234.157	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427132234.157 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 13:23	Success	-	View
exp_self.20260427131544.156_20260427_131545 Paper: self.20260427131544.156	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427131544.156 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 13:16	Success	-	View
exp_pytrain.20260427131032.039_20260427_131032 Paper: pytrain.20260427131032.039	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 13:11	Success	-	View
exp_self.20260427130821.155_20260427_130821 Paper: self.20260427130821.155	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427130821.155 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 13:09	Success	-	View
exp_self.20260427130111.154_20260427_130112 Paper: self.20260427130111.154	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427130111.154 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 13:02	Success	-	View
exp_self.20260427125401.153_20260427_125401 Paper: self.20260427125401.153	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427125401.153 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 12:55	Success	-	View
exp_self.20260427124714.152_20260427_124715 Paper: self.20260427124714.152	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427124714.152 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 12:48	Success	-	View
exp_self.20260427124030.151_20260427_124030 Paper: self.20260427124030.151	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427124030.151 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 12:41	Success	-	View
exp_pytrain.20260427123742.038_20260427_123751 Paper: pytrain.20260427123742.038	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 12:38	Success	-	View
exp_self.20260427121526.150_20260427_121527 Paper: self.20260427121526.150	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427121526.150 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 12:16	Success	-	View
exp_self.20260427120753.149_20260427_120754 Paper: self.20260427120753.149	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427120753.149 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 12:08	Success	-	View
exp_self.20260427120023.148_20260427_120023 Paper: self.20260427120023.148	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427120023.148 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 12:01	Success	-	View
exp_pytrain.20260427115748.037_20260427_115748 Paper: pytrain.20260427115748.037	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 11:58	Success	-	View
exp_self.20260427115057.147_20260427_115057 Paper: self.20260427115057.147	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427115057.147 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 11:51	Success	-	View
exp_self.20260427114321.146_20260427_114322 Paper: self.20260427114321.146	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427114321.146 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 11:44	Success	-	View
exp_self.20260427113546.145_20260427_113546 Paper: self.20260427113546.145	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427113546.145 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 11:36	Success	-	View
exp_self.20260427112818.144_20260427_112818 Paper: self.20260427112818.144	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427112818.144 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 11:29	Success	-	View
exp_pytrain.20260427112545.036_20260427_112546 Paper: pytrain.20260427112545.036	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 11:26	Success	-	View
exp_self.20260427111853.143_20260427_111853 Paper: self.20260427111853.143	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427111853.143 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 11:19	Success	-	View
exp_self.20260427111114.142_20260427_111115 Paper: self.20260427111114.142	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427111114.142 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 11:12	Success	-	View
exp_self.20260427110339.141_20260427_110339 Paper: self.20260427110339.141	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427110339.141 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 11:04	Success	-	View
exp_self.20260427105603.140_20260427_105604 Paper: self.20260427105603.140	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427105603.140 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 10:57	Success	-	View
exp_pytrain.20260427105336.035_20260427_105336 Paper: pytrain.20260427105336.035	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 10:54	Success	-	View
exp_self.20260427104635.139_20260427_104636 Paper: self.20260427104635.139	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427104635.139 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 10:47	Success	-	View
exp_self.20260427103858.138_20260427_103858 Paper: self.20260427103858.138	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427103858.138 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 10:40	Success	-	View
exp_self.20260427103110.137_20260427_103111 Paper: self.20260427103110.137	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427103110.137 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 10:32	Success	-	View
exp_self.20260427102330.136_20260427_102330 Paper: self.20260427102330.136	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427102330.136 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 10:24	Success	-	View
exp_pytrain.20260427102102.034_20260427_102102 Paper: pytrain.20260427102102.034	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 10:22	Success	-	View
exp_self.20260427101355.135_20260427_101356 Paper: self.20260427101355.135	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427101355.135 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 10:14	Success	-	View
exp_self.20260427100617.134_20260427_100617 Paper: self.20260427100617.134	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427100617.134 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 10:07	Success	-	View
exp_self.20260427095836.133_20260427_095836 Paper: self.20260427095836.133	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427095836.133 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 09:59	Success	-	View
exp_self.20260427095059.132_20260427_095100 Paper: self.20260427095059.132	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427095059.132 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 09:52	Success	-	View
exp_pytrain.20260427094833.033_20260427_094834 Paper: pytrain.20260427094833.033	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 09:49	Success	-	View
exp_self.20260427094123.131_20260427_094124 Paper: self.20260427094123.131	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427094123.131 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 09:42	Success	-	View
exp_self.20260427093349.130_20260427_093349 Paper: self.20260427093349.130	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427093349.130 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 09:34	Success	-	View
exp_self.20260427092615.129_20260427_092615 Paper: self.20260427092615.129	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427092615.129 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 09:27	Success	-	View
exp_self.20260427091835.128_20260427_091835 Paper: self.20260427091835.128	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427091835.128 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 09:19	Success	-	View
exp_pytrain.20260427091608.032_20260427_091608 Paper: pytrain.20260427091608.032	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 09:17	Success	-	View
exp_self.20260427090902.127_20260427_090902 Paper: self.20260427090902.127	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427090902.127 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 09:10	Success	-	View
exp_self.20260427090133.126_20260427_090133 Paper: self.20260427090133.126	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427090133.126 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 09:02	Success	-	View
exp_self.20260427085404.125_20260427_085404 Paper: self.20260427085404.125	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427085404.125 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 08:55	Success	-	View
exp_self.20260427084627.124_20260427_084627 Paper: self.20260427084627.124	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427084627.124 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 08:47	Success	-	View
exp_pytrain.20260427084400.031_20260427_084401 Paper: pytrain.20260427084400.031	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 08:45	Success	-	View
exp_self.20260427083940.123_20260427_083941 Paper: self.20260427083940.123	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427083940.123 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 08:40	Success	-	View
exp_self.20260427083159.122_20260427_083200 Paper: self.20260427083159.122	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427083159.122 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 08:33	Success	-	View
exp_self.20260427082419.121_20260427_082420 Paper: self.20260427082419.121	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427082419.121 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 08:25	Success	-	View
exp_self.20260427081633.120_20260427_081635 Paper: self.20260427081633.120	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427081633.120 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 08:17	Success	-	View
exp_pytrain.20260427081242.030_20260427_081243 Paper: pytrain.20260427081242.030	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 08:13	Success	-	View
exp_self.20260427080931.119_20260427_080931 Paper: self.20260427080931.119	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427080931.119 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 08:10	Success	-	View
exp_hf_2604.22085_20260427_080627 Paper: hf_2604.22085	Memanto: Typed Semantic Memory with Information-Theoretic Retrieval for Long-Horizon Agents Paper ID: hf_2604.22085 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-27 08:07	Success	-	View
exp_self.20260427075841.118_20260427_075842 Paper: self.20260427075841.118	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427075841.118 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 08:01	Success	-	View
exp_self.20260427075026.117_20260427_075027 Paper: self.20260427075026.117	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427075026.117 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 07:51	Success	-	View
exp_self.20260427074258.116_20260427_074258 Paper: self.20260427074258.116	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427074258.116 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 07:44	Success	-	View
exp_pytrain.20260427074035.029_20260427_074036 Paper: pytrain.20260427074035.029	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 07:41	Success	-	View
exp_self.20260427073337.115_20260427_073337 Paper: self.20260427073337.115	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427073337.115 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 07:34	Success	-	View
exp_self.20260427072608.114_20260427_072608 Paper: self.20260427072608.114	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427072608.114 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 07:27	Success	-	View
exp_self.20260427071826.113_20260427_071827 Paper: self.20260427071826.113	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427071826.113 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 07:19	Success	-	View
exp_self.20260427071041.112_20260427_071041 Paper: self.20260427071041.112	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427071041.112 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 07:11	Success	-	View
exp_pytrain.20260427070820.028_20260427_070820 Paper: pytrain.20260427070820.028	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 07:09	Success	-	View
exp_self.20260427070122.111_20260427_070122 Paper: self.20260427070122.111	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427070122.111 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 07:02	Success	-	View
exp_self.20260427065356.110_20260427_065357 Paper: self.20260427065356.110	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427065356.110 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 06:54	Success	-	View
exp_self.20260427064626.109_20260427_064626 Paper: self.20260427064626.109	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427064626.109 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 06:47	Success	-	View
exp_self.20260427063900.108_20260427_063901 Paper: self.20260427063900.108	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427063900.108 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 06:40	Success	-	View
exp_pytrain.20260427063639.027_20260427_063640 Paper: pytrain.20260427063639.027	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 06:37	Success	-	View
exp_self.20260427062939.107_20260427_062939 Paper: self.20260427062939.107	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427062939.107 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 06:30	Success	-	View
exp_self.20260427062210.106_20260427_062210 Paper: self.20260427062210.106	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427062210.106 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 06:23	Success	-	View
exp_self.20260427061445.105_20260427_061446 Paper: self.20260427061445.105	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427061445.105 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 06:15	Success	-	View
exp_self.20260427060749.104_20260427_060750 Paper: self.20260427060749.104	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427060749.104 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 06:08	Success	-	View
exp_pytrain.20260427060520.026_20260427_060521 Paper: pytrain.20260427060520.026	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 06:06	Success	-	View
exp_self.20260427055835.103_20260427_055836 Paper: self.20260427055835.103	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427055835.103 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 05:59	Success	-	View
exp_self.20260427055105.102_20260427_055105 Paper: self.20260427055105.102	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427055105.102 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 05:52	Success	-	View
exp_self.20260427054340.101_20260427_054340 Paper: self.20260427054340.101	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427054340.101 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 05:44	Success	-	View
exp_self.20260427053618.100_20260427_053619 Paper: self.20260427053618.100	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427053618.100 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 05:37	Success	-	View
exp_pytrain.20260427053352.025_20260427_053353 Paper: pytrain.20260427053352.025	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 05:34	Success	-	View
exp_self.20260427052802.099_20260427_052802 Paper: self.20260427052802.099	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427052802.099 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 05:29	Success	-	View
exp_self.20260427052040.098_20260427_052041 Paper: self.20260427052040.098	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427052040.098 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 05:21	Success	-	View
exp_self.20260427051316.097_20260427_051316 Paper: self.20260427051316.097	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427051316.097 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 05:14	Success	-	View
exp_self.20260427050545.096_20260427_050545 Paper: self.20260427050545.096	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427050545.096 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 05:06	Success	-	View
exp_pytrain.20260427050214.024_20260427_050215 Paper: pytrain.20260427050214.024	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 05:03	Success	-	View
exp_self.20260427045903.095_20260427_045903 Paper: self.20260427045903.095	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427045903.095 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 05:00	Success	-	View
exp_self.20260427045140.094_20260427_045141 Paper: self.20260427045140.094	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427045140.094 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 04:52	Success	-	View
exp_self.20260427044419.093_20260427_044419 Paper: self.20260427044419.093	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427044419.093 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 04:45	Success	-	View
exp_self.20260427043655.092_20260427_043655 Paper: self.20260427043655.092	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427043655.092 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 04:37	Success	-	View
exp_pytrain.20260427043041.023_20260427_043041 Paper: pytrain.20260427043041.023	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 04:31	Success	-	View
exp_self.20260427042847.091_20260427_042848 Paper: self.20260427042847.091	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427042847.091 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 04:29	Success	-	View
exp_hf_2604.21718_20260427_042315 Paper: hf_2604.21718	Building a Precise Video Language with Human-AI Oversight Paper ID: hf_2604.21718 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-27 04:24	Success	-	View
exp_self.20260427042117.090_20260427_042118 Paper: self.20260427042117.090	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427042117.090 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 04:22	Success	-	View
exp_self.20260427041353.089_20260427_041354 Paper: self.20260427041353.089	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427041353.089 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 04:14	Success	-	View
exp_cr_10.1007_s11831-026-10598-4_20260427_040933 Paper: cr_10.1007_s11831-026-10598-4	Building Expert Small Models: A Comprehensive Survey of Model Compression, Knowledge Distillation, and Augmented Inferen... Paper ID: cr_10.1007_s11831-026-10598-4 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Reco...	04-27 04:10	Success	-	View
exp_self.20260427040714.088_20260427_040714 Paper: self.20260427040714.088	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427040714.088 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 04:08	Success	-	View
exp_self.20260427035947.087_20260427_035947 Paper: self.20260427035947.087	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427035947.087 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 04:00	Success	-	View
exp_pytrain.20260427035726.022_20260427_035727 Paper: pytrain.20260427035726.022	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 03:58	Success	-	View
exp_self.20260427035034.086_20260427_035035 Paper: self.20260427035034.086	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427035034.086 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 03:51	Success	-	View
exp_self.20260427034310.085_20260427_034311 Paper: self.20260427034310.085	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427034310.085 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 03:44	Success	-	View
exp_self.20260427033544.084_20260427_033544 Paper: self.20260427033544.084	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427033544.084 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 03:36	Success	-	View
exp_self.20260427032822.083_20260427_032822 Paper: self.20260427032822.083	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427032822.083 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 03:29	Success	-	View
exp_pytrain.20260427032601.021_20260427_032601 Paper: pytrain.20260427032601.021	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 03:27	Success	-	View
exp_self.20260427031905.082_20260427_031905 Paper: self.20260427031905.082	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427031905.082 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 03:20	Success	-	View
exp_self.20260427031144.081_20260427_031144 Paper: self.20260427031144.081	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427031144.081 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 03:12	Success	-	View
exp_self.20260427030425.080_20260427_030425 Paper: self.20260427030425.080	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427030425.080 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 03:05	Success	-	View
exp_self.20260427025659.079_20260427_025700 Paper: self.20260427025659.079	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427025659.079 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 02:58	Success	-	View
exp_pytrain.20260427025436.020_20260427_025437 Paper: pytrain.20260427025436.020	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 02:55	Success	-	View
exp_self.20260427024739.078_20260427_024739 Paper: self.20260427024739.078	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427024739.078 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 02:48	Success	-	View
exp_self.20260427024021.077_20260427_024021 Paper: self.20260427024021.077	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427024021.077 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 02:41	Success	-	View
exp_cr_10.1007_s40864-026-00269-9_20260427_023706 Paper: cr_10.1007_s40864-026-00269-9	Train Slide Prediction and Risk Assessment Using Vehicle-Signal Data: A Data-Model Fusion Method Paper ID: cr_10.1007_s40864-026-00269-9 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Reco...	04-27 02:38	Success	-	View
exp_self.20260427023250.076_20260427_023251 Paper: self.20260427023250.076	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427023250.076 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 02:33	Success	-	View
exp_self.20260427022533.075_20260427_022534 Paper: self.20260427022533.075	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427022533.075 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 02:26	Success	-	View
exp_pytrain.20260427022306.019_20260427_022306 Paper: pytrain.20260427022306.019	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 02:24	Success	-	View
exp_self.20260427021620.074_20260427_021620 Paper: self.20260427021620.074	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427021620.074 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 02:17	Success	-	View
exp_self.20260427020850.073_20260427_020851 Paper: self.20260427020850.073	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427020850.073 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 02:09	Success	-	View
exp_self.20260427020127.072_20260427_020127 Paper: self.20260427020127.072	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427020127.072 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 02:02	Success	-	View
exp_self.20260427015407.071_20260427_015407 Paper: self.20260427015407.071	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427015407.071 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 01:55	Success	-	View
exp_pytrain.20260427015145.018_20260427_015146 Paper: pytrain.20260427015145.018	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 01:52	Success	-	View
exp_self.20260427014457.070_20260427_014458 Paper: self.20260427014457.070	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427014457.070 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 01:46	Success	-	View
exp_self.20260427013738.069_20260427_013738 Paper: self.20260427013738.069	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427013738.069 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 01:38	Success	-	View
exp_self.20260427013012.068_20260427_013013 Paper: self.20260427013012.068	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427013012.068 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 01:31	Success	-	View
exp_self.20260427012251.067_20260427_012251 Paper: self.20260427012251.067	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427012251.067 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 01:23	Success	-	View
exp_pytrain.20260427011919.017_20260427_011919 Paper: pytrain.20260427011919.017	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 01:20	Success	-	View
exp_self.20260427011511.066_20260427_011512 Paper: self.20260427011511.066	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427011511.066 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 01:16	Success	-	View
exp_self.20260427010748.065_20260427_010749 Paper: self.20260427010748.065	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427010748.065 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 01:08	Success	-	View
exp_self.20260427010023.064_20260427_010023 Paper: self.20260427010023.064	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427010023.064 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 01:01	Success	-	View
exp_self.20260427005303.063_20260427_005304 Paper: self.20260427005303.063	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427005303.063 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 00:54	Success	-	View
exp_pytrain.20260427004721.016_20260427_004722 Paper: pytrain.20260427004721.016	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 00:48	Success	-	View
exp_self.20260427004529.062_20260427_004529 Paper: self.20260427004529.062	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427004529.062 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 00:46	Success	-	View
exp_self.20260427003811.061_20260427_003811 Paper: self.20260427003811.061	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427003811.061 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 00:39	Success	-	View
exp_self.20260427003056.060_20260427_003056 Paper: self.20260427003056.060	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427003056.060 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 00:31	Success	-	View
exp_self.20260427002338.059_20260427_002339 Paper: self.20260427002338.059	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427002338.059 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 00:24	Success	-	View
exp_self.20260427001617.058_20260427_001617 Paper: self.20260427001617.058	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427001617.058 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 00:17	Success	-	View
exp_pytrain.20260427001353.015_20260427_001353 Paper: pytrain.20260427001353.015	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-27 00:14	Success	-	View
exp_self.20260427000659.057_20260427_000700 Paper: self.20260427000659.057	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260427000659.057 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 00:08	Success	-	View
exp_cr_10.3897_jucs.160588_20260427_000351 Paper: cr_10.3897_jucs.160588	Duygu-Turk: A Context-Aware Sentiment Analysis Framework for Turkish, Based on Plutchik&rsquo;s Emotion Model Paper ID: cr_10.3897_jucs.160588 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered b...	04-27 00:04	Success	-	View
exp_self.20260426235935.056_20260426_235935 Paper: self.20260426235935.056	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426235935.056 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-27 00:00	Success	-	View
exp_self.20260426235215.055_20260426_235215 Paper: self.20260426235215.055	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426235215.055 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 23:53	Success	-	View
exp_hf_2604.22294_20260426_234757 Paper: hf_2604.22294	Contexts are Never Long Enough: Structured Reasoning for Scalable Question Answering over Long Document Sets Paper ID: hf_2604.22294 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-26 23:48	Success	-	View
exp_self.20260426234451.054_20260426_234452 Paper: self.20260426234451.054	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426234451.054 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 23:45	Success	-	View
exp_pytrain.20260426234228.014_20260426_234229 Paper: pytrain.20260426234228.014	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 23:43	Success	-	View
exp_self.20260426233535.053_20260426_233536 Paper: self.20260426233535.053	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426233535.053 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 23:36	Success	-	View
exp_self.20260426232816.052_20260426_232817 Paper: self.20260426232816.052	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426232816.052 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 23:29	Success	-	View
exp_self.20260426232057.051_20260426_232057 Paper: self.20260426232057.051	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426232057.051 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 23:21	Success	-	View
exp_hf_2604.18580_20260426_231743 Paper: hf_2604.18580	Sessa: Selective State Space Attention Paper ID: hf_2604.18580 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-26 23:18	Success	-	View
exp_self.20260426231330.050_20260426_231330 Paper: self.20260426231330.050	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426231330.050 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 23:14	Success	-	View
exp_pytrain.20260426231109.013_20260426_231109 Paper: pytrain.20260426231109.013	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 23:12	Success	-	View
exp_self.20260426230421.049_20260426_230421 Paper: self.20260426230421.049	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426230421.049 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 23:05	Success	-	View
exp_hf_2604.22586_20260426_230106 Paper: hf_2604.22586	FlowAnchor: Stabilizing the Editing Signal for Inversion-Free Video Editing Paper ID: hf_2604.22586 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-26 23:02	Success	-	View
exp_self.20260426225652.048_20260426_225652 Paper: self.20260426225652.048	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426225652.048 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 22:57	Success	-	View
exp_self.20260426224931.047_20260426_224931 Paper: self.20260426224931.047	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426224931.047 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 22:50	Success	-	View
exp_self.20260426224210.046_20260426_224211 Paper: self.20260426224210.046	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426224210.046 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 22:43	Success	-	View
exp_pytrain.20260426223943.012_20260426_223943 Paper: pytrain.20260426223943.012	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 22:40	Success	-	View
exp_self.20260426223531.045_20260426_223532 Paper: self.20260426223531.045	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426223531.045 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 22:36	Success	-	View
exp_self.20260426222812.044_20260426_222813 Paper: self.20260426222812.044	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426222812.044 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 22:29	Success	-	View
exp_self.20260426222046.043_20260426_222046 Paper: self.20260426222046.043	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426222046.043 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 22:21	Success	-	View
exp_hf_2604.16353_20260426_221752 Paper: hf_2604.16353	AgriIR: A Scalable Framework for Domain-Specific Knowledge Retrieval Paper ID: hf_2604.16353 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-26 22:18	Success	-	View
exp_self.20260426221045.042_20260426_221045 Paper: self.20260426221045.042	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426221045.042 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 22:11	Success	-	View
exp_pytrain.20260426220815.011_20260426_220816 Paper: pytrain.20260426220815.011	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 22:09	Success	-	View
exp_self.20260426220117.041_20260426_220117 Paper: self.20260426220117.041	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426220117.041 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 22:02	Success	-	View
exp_self.20260426215354.040_20260426_215354 Paper: self.20260426215354.040	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426215354.040 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 21:54	Success	-	View
exp_self.20260426214621.039_20260426_214622 Paper: self.20260426214621.039	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426214621.039 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 21:47	Success	-	View
exp_self.20260426213848.038_20260426_213848 Paper: self.20260426213848.038	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426213848.038 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 21:39	Success	-	View
exp_pytrain.20260426213619.010_20260426_213619 Paper: pytrain.20260426213619.010	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 21:37	Success	-	View
exp_self.20260426212919.037_20260426_212919 Paper: self.20260426212919.037	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426212919.037 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 21:30	Success	-	View
exp_self.20260426212143.036_20260426_212143 Paper: self.20260426212143.036	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426212143.036 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 21:22	Success	-	View
exp_hf_2604.08645_20260426_211825 Paper: hf_2604.08645	3D-VCD: Hallucination Mitigation in 3D-LLM Embodied Agents through Visual Contrastive Decoding Paper ID: hf_2604.08645 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-26 21:19	Success	-	View
exp_self.20260426211402.035_20260426_211402 Paper: self.20260426211402.035	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426211402.035 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 21:15	Success	-	View
exp_self.20260426210634.034_20260426_210635 Paper: self.20260426210634.034	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426210634.034 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 21:07	Success	-	View
exp_pytrain.20260426210402.009_20260426_210402 Paper: pytrain.20260426210402.009	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 21:05	Success	-	View
exp_hf_2604.18519_20260426_210116 Paper: hf_2604.18519	LLM Safety From Within: Detecting Harmful Content with Internal Representations Paper ID: hf_2604.18519 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-26 21:02	Success	-	View
exp_self.20260426205759.033_20260426_205800 Paper: self.20260426205759.033	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426205759.033 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 20:59	Success	-	View
exp_2604.22750v1_20260426_205443 Paper: 2604.22750v1	How Do AI Agents Spend Your Money? Analyzing and Predicting Token Consumption in Agentic Coding Tasks Paper ID: 2604.22750v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-26 20:55	Success	-	View
exp_hf_2604.22152_20260426_205113 Paper: hf_2604.22152	dWorldEval: Scalable Robotic Policy Evaluation via Discrete Diffusion World Model Paper ID: hf_2604.22152 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-26 20:52	Success	-	View
exp_self.20260426204912.032_20260426_204912 Paper: self.20260426204912.032	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426204912.032 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 20:50	Success	-	View
exp_self.20260426204139.031_20260426_204140 Paper: self.20260426204139.031	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426204139.031 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 20:42	Success	-	View
exp_self.20260426203359.030_20260426_203359 Paper: self.20260426203359.030	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426203359.030 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 20:35	Success	-	View
exp_pytrain.20260426203131.008_20260426_203131 Paper: pytrain.20260426203131.008	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 20:32	Success	-	View
exp_self.20260426202432.029_20260426_202433 Paper: self.20260426202432.029	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426202432.029 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 20:25	Success	-	View
exp_self.20260426201702.028_20260426_201702 Paper: self.20260426201702.028	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426201702.028 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 20:18	Success	-	View
exp_self.20260426200935.027_20260426_200936 Paper: self.20260426200935.027	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426200935.027 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 20:10	Success	-	View
exp_self.20260426200206.026_20260426_200207 Paper: self.20260426200206.026	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426200206.026 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 20:03	Success	-	View
exp_pytrain.20260426195935.007_20260426_195936 Paper: pytrain.20260426195935.007	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 20:00	Success	-	View
exp_self.20260426195228.025_20260426_195228 Paper: self.20260426195228.025	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426195228.025 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 19:53	Success	-	View
exp_self.20260426194503.024_20260426_194503 Paper: self.20260426194503.024	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426194503.024 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 19:46	Success	-	View
exp_self.20260426193735.023_20260426_193736 Paper: self.20260426193735.023	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426193735.023 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 19:38	Success	-	View
exp_self.20260426193006.022_20260426_193007 Paper: self.20260426193006.022	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426193006.022 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 19:31	Success	-	View
exp_pytrain.20260426192736.006_20260426_192737 Paper: pytrain.20260426192736.006	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 19:28	Success	-	View
exp_self.20260426192035.021_20260426_192035 Paper: self.20260426192035.021	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426192035.021 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 19:21	Success	-	View
exp_self.20260426191310.020_20260426_191310 Paper: self.20260426191310.020	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426191310.020 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 19:14	Success	-	View
exp_self.20260426190537.019_20260426_190538 Paper: self.20260426190537.019	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426190537.019 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 19:06	Success	-	View
exp_self.20260426185802.018_20260426_185802 Paper: self.20260426185802.018	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426185802.018 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 18:59	Success	-	View
exp_pytrain.20260426185529.005_20260426_185529 Paper: pytrain.20260426185529.005	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 18:56	Success	-	View
exp_self.20260426184822.017_20260426_184822 Paper: self.20260426184822.017	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426184822.017 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 18:49	Success	-	View
exp_self.20260426184051.016_20260426_184052 Paper: self.20260426184051.016	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426184051.016 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 18:41	Success	-	View
exp_self.20260426183320.015_20260426_183320 Paper: self.20260426183320.015	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426183320.015 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 18:34	Success	-	View
exp_self.20260426182541.014_20260426_182542 Paper: self.20260426182541.014	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426182541.014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 18:26	Success	-	View
exp_pytrain.20260426182252.004_20260426_182252 Paper: pytrain.20260426182252.004	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 18:23	Success	-	View
exp_self.20260426181823.013_20260426_181824 Paper: self.20260426181823.013	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426181823.013 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 18:19	Success	-	View
exp_self.20260426181030.012_20260426_181030 Paper: self.20260426181030.012	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426181030.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 18:11	Success	-	View
exp_self.20260426180253.011_20260426_180254 Paper: self.20260426180253.011	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426180253.011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 18:04	Success	-	View
exp_self.20260426175515.010_20260426_175515 Paper: self.20260426175515.010	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426175515.010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 17:56	Success	-	View
exp_pytrain.20260426175125.003_20260426_175125 Paper: pytrain.20260426175125.003	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 17:52	Success	-	View
exp_self.20260426174800.009_20260426_174801 Paper: self.20260426174800.009	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426174800.009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 17:49	Success	-	View
exp_self.20260426174014.008_20260426_174014 Paper: self.20260426174014.008	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426174014.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 17:41	Success	-	View
exp_self.20260426173236.007_20260426_173237 Paper: self.20260426173236.007	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426173236.007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 17:33	Success	-	View
exp_self.20260426172503.006_20260426_172504 Paper: self.20260426172503.006	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426172503.006 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 17:26	Success	-	View
exp_pytrain.20260426171918.002_20260426_171918 Paper: pytrain.20260426171918.002	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 17:20	Success	-	View
exp_self.20260426171725.005_20260426_171725 Paper: self.20260426171725.005	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426171725.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 17:18	Success	-	View
exp_self.20260426171007.004_20260426_171008 Paper: self.20260426171007.004	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426171007.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 17:11	Success	-	View
exp_self.20260426170246.003_20260426_170246 Paper: self.20260426170246.003	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426170246.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 17:03	Success	-	View
exp_self.20260426165528.002_20260426_165528 Paper: self.20260426165528.002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426165528.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 16:56	Success	-	View
exp_self.20260426164809.001_20260426_164810 Paper: self.20260426164809.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426164809.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 16:49	Success	-	View
exp_pytrain.20260426164548.001_20260426_164548 Paper: pytrain.20260426164548.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 16:46	Success	-	View
exp_self.20260426163845.034_20260426_163846 Paper: self.20260426163845.034	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426163845.034 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 16:39	Success	-	View
exp_pytrain.20260426163546.009_20260426_163547 Paper: pytrain.20260426163546.009	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 16:37	Success	-	View
exp_self.20260426162851.033_20260426_162851 Paper: self.20260426162851.033	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426162851.033 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 16:29	Success	-	View
exp_self.20260426162114.032_20260426_162115 Paper: self.20260426162114.032	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426162114.032 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 16:22	Success	-	View
exp_self.20260426161348.031_20260426_161348 Paper: self.20260426161348.031	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426161348.031 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 16:14	Success	-	View
exp_self.20260426160625.030_20260426_160626 Paper: self.20260426160625.030	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426160625.030 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 16:07	Success	-	View
exp_pytrain.20260426160348.008_20260426_160349 Paper: pytrain.20260426160348.008	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 16:04	Success	-	View
exp_self.20260426155659.029_20260426_155700 Paper: self.20260426155659.029	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426155659.029 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 15:58	Success	-	View
exp_self.20260426154924.028_20260426_154924 Paper: self.20260426154924.028	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426154924.028 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 15:50	Success	-	View
exp_self.20260426154203.027_20260426_154203 Paper: self.20260426154203.027	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426154203.027 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 15:43	Success	-	View
exp_self.20260426153444.026_20260426_153444 Paper: self.20260426153444.026	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426153444.026 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 15:35	Success	-	View
exp_pytrain.20260426153222.007_20260426_153222 Paper: pytrain.20260426153222.007	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 15:33	Success	-	View
exp_self.20260426152529.025_20260426_152529 Paper: self.20260426152529.025	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426152529.025 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 15:26	Success	-	View
exp_self.20260426151810.024_20260426_151811 Paper: self.20260426151810.024	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426151810.024 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 15:19	Success	-	View
exp_self.20260426151053.023_20260426_151054 Paper: self.20260426151053.023	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426151053.023 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 15:11	Success	-	View
exp_self.20260426150331.022_20260426_150331 Paper: self.20260426150331.022	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426150331.022 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 15:04	Success	-	View
exp_pytrain.20260426150105.006_20260426_150105 Paper: pytrain.20260426150105.006	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 15:02	Success	-	View
exp_self.20260426145407.021_20260426_145407 Paper: self.20260426145407.021	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426145407.021 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 14:55	Success	-	View
exp_self.20260426144649.020_20260426_144649 Paper: self.20260426144649.020	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426144649.020 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 14:47	Success	-	View
exp_self.20260426143929.019_20260426_143929 Paper: self.20260426143929.019	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426143929.019 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 14:40	Success	-	View
exp_self.20260426143209.018_20260426_143209 Paper: self.20260426143209.018	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426143209.018 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 14:33	Success	-	View
exp_pytrain.20260426142947.005_20260426_142947 Paper: pytrain.20260426142947.005	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 14:30	Success	-	View
exp_self.20260426142256.017_20260426_142256 Paper: self.20260426142256.017	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426142256.017 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 14:23	Success	-	View
exp_self.20260426141535.016_20260426_141536 Paper: self.20260426141535.016	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426141535.016 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 14:16	Success	-	View
exp_self.20260426140815.015_20260426_140815 Paper: self.20260426140815.015	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426140815.015 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 14:09	Success	-	View
exp_self.20260426140053.014_20260426_140054 Paper: self.20260426140053.014	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426140053.014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 14:01	Success	-	View
exp_pytrain.20260426135832.004_20260426_135832 Paper: pytrain.20260426135832.004	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 13:59	Success	-	View
exp_self.20260426135145.013_20260426_135145 Paper: self.20260426135145.013	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426135145.013 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 13:52	Success	-	View
exp_self.20260426134424.012_20260426_134425 Paper: self.20260426134424.012	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426134424.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 13:45	Success	-	View
exp_self.20260426133702.011_20260426_133702 Paper: self.20260426133702.011	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426133702.011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 13:38	Success	-	View
exp_self.20260426132939.010_20260426_132939 Paper: self.20260426132939.010	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426132939.010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 13:30	Success	-	View
exp_pytrain.20260426132608.003_20260426_132608 Paper: pytrain.20260426132608.003	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 13:27	Success	-	View
exp_self.20260426132202.009_20260426_132202 Paper: self.20260426132202.009	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426132202.009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 13:23	Success	-	View
exp_self.20260426131440.008_20260426_131440 Paper: self.20260426131440.008	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426131440.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 13:15	Success	-	View
exp_self.20260426130725.007_20260426_130725 Paper: self.20260426130725.007	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426130725.007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 13:08	Success	-	View
exp_self.20260426130007.006_20260426_130008 Paper: self.20260426130007.006	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426130007.006 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 13:01	Success	-	View
exp_pytrain.20260426125423.002_20260426_125424 Paper: pytrain.20260426125423.002	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 12:55	Success	-	View
exp_self.20260426125230.005_20260426_125230 Paper: self.20260426125230.005	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426125230.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 12:53	Success	-	View
exp_self.20260426124512.004_20260426_124513 Paper: self.20260426124512.004	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426124512.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 12:46	Success	-	View
exp_self.20260426123750.003_20260426_123750 Paper: self.20260426123750.003	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426123750.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 12:38	Success	-	View
exp_self.20260426123029.002_20260426_123029 Paper: self.20260426123029.002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426123029.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 12:31	Success	-	View
exp_self.20260426122311.001_20260426_122311 Paper: self.20260426122311.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426122311.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 12:24	Success	-	View
exp_pytrain.20260426122049.001_20260426_122049 Paper: pytrain.20260426122049.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 12:21	Success	-	View
exp_pytrain.20260426115648.002_20260426_115843 Paper: pytrain.20260426115648.002	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 11:59	Success	-	View
exp_self.20260426114303.004_20260426_114303 Paper: self.20260426114303.004	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426114303.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 11:44	Success	-	View
exp_self.20260426113539.003_20260426_113539 Paper: self.20260426113539.003	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426113539.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 11:36	Success	-	View
exp_self.20260426112819.002_20260426_112820 Paper: self.20260426112819.002	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426112819.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 11:29	Success	-	View
exp_self.20260426112059.001_20260426_112059 Paper: self.20260426112059.001	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426112059.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 11:22	Success	-	View
exp_pytrain.20260426111838.001_20260426_111838 Paper: pytrain.20260426111838.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 11:19	Success	-	View
exp_self.20260426111056.851_20260426_111056 Paper: self.20260426111056.851	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426111056.851 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 11:11	Success	-	View
exp_self.20260426110335.850_20260426_110336 Paper: self.20260426110335.850	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426110335.850 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 11:04	Success	-	View
exp_pytrain.20260426110006.211_20260426_110007 Paper: pytrain.20260426110006.211	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 11:01	Success	-	View
exp_self.20260426105559.849_20260426_105600 Paper: self.20260426105559.849	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426105559.849 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 10:57	Success	-	View
exp_self.20260426104836.848_20260426_104836 Paper: self.20260426104836.848	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426104836.848 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 10:49	Success	-	View
exp_self.20260426104118.847_20260426_104119 Paper: self.20260426104118.847	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426104118.847 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 10:42	Success	-	View
exp_self.20260426103358.846_20260426_103358 Paper: self.20260426103358.846	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426103358.846 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 10:35	Success	-	View
exp_pytrain.20260426102813.210_20260426_102814 Paper: pytrain.20260426102813.210	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 10:29	Success	-	View
exp_self.20260426102620.845_20260426_102620 Paper: self.20260426102620.845	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426102620.845 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 10:27	Success	-	View
exp_self.20260426101902.844_20260426_101902 Paper: self.20260426101902.844	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426101902.844 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 10:20	Success	-	View
exp_self.20260426101145.843_20260426_101146 Paper: self.20260426101145.843	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426101145.843 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 10:12	Success	-	View
exp_self.20260426100425.842_20260426_100425 Paper: self.20260426100425.842	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426100425.842 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 10:05	Success	-	View
exp_self.20260426095703.841_20260426_095704 Paper: self.20260426095703.841	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426095703.841 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 09:58	Success	-	View
exp_pytrain.20260426095441.209_20260426_095441 Paper: pytrain.20260426095441.209	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 09:55	Success	-	View
exp_self.20260426094748.840_20260426_094748 Paper: self.20260426094748.840	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426094748.840 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 09:48	Success	-	View
exp_self.20260426094033.839_20260426_094033 Paper: self.20260426094033.839	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426094033.839 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 09:41	Success	-	View
exp_self.20260426093311.838_20260426_093312 Paper: self.20260426093311.838	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426093311.838 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 09:34	Success	-	View
exp_self.20260426092549.837_20260426_092549 Paper: self.20260426092549.837	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426092549.837 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 09:26	Success	-	View
exp_pytrain.20260426092325.208_20260426_092325 Paper: pytrain.20260426092325.208	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 09:24	Success	-	View
exp_self.20260426091634.836_20260426_091635 Paper: self.20260426091634.836	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426091634.836 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 09:17	Success	-	View
exp_self.20260426090914.835_20260426_090914 Paper: self.20260426090914.835	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426090914.835 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 09:10	Success	-	View
exp_self.20260426090151.834_20260426_090152 Paper: self.20260426090151.834	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426090151.834 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 09:02	Success	-	View
exp_self.20260426085430.833_20260426_085430 Paper: self.20260426085430.833	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426085430.833 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 08:55	Success	-	View
exp_pytrain.20260426085209.207_20260426_085209 Paper: pytrain.20260426085209.207	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 08:53	Success	-	View
exp_self.20260426084519.832_20260426_084520 Paper: self.20260426084519.832	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426084519.832 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 08:46	Success	-	View
exp_self.20260426083758.831_20260426_083758 Paper: self.20260426083758.831	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426083758.831 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 08:39	Success	-	View
exp_self.20260426083033.830_20260426_083033 Paper: self.20260426083033.830	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426083033.830 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 08:31	Success	-	View
exp_self.20260426082312.829_20260426_082313 Paper: self.20260426082312.829	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426082312.829 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 08:24	Success	-	View
exp_pytrain.20260426082052.206_20260426_082053 Paper: pytrain.20260426082052.206	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 08:21	Success	-	View
exp_self.20260426081359.828_20260426_081400 Paper: self.20260426081359.828	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426081359.828 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 08:15	Success	-	View
exp_self.20260426080640.827_20260426_080640 Paper: self.20260426080640.827	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426080640.827 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 08:07	Success	-	View
exp_self.20260426075917.826_20260426_075918 Paper: self.20260426075917.826	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426075917.826 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 08:00	Success	-	View
exp_self.20260426075202.825_20260426_075202 Paper: self.20260426075202.825	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426075202.825 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 07:53	Success	-	View
exp_pytrain.20260426074936.205_20260426_074937 Paper: pytrain.20260426074936.205	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 07:50	Success	-	View
exp_self.20260426074253.824_20260426_074254 Paper: self.20260426074253.824	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426074253.824 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 07:43	Success	-	View
exp_self.20260426073525.823_20260426_073526 Paper: self.20260426073525.823	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426073525.823 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 07:36	Success	-	View
exp_self.20260426072748.822_20260426_072748 Paper: self.20260426072748.822	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426072748.822 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 07:28	Success	-	View
exp_self.20260426072017.821_20260426_072017 Paper: self.20260426072017.821	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426072017.821 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 07:21	Success	-	View
exp_pytrain.20260426071754.204_20260426_071754 Paper: pytrain.20260426071754.204	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 07:18	Success	-	View
exp_self.20260426071052.820_20260426_071053 Paper: self.20260426071052.820	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426071052.820 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 07:11	Success	-	View
exp_self.20260426070333.819_20260426_070333 Paper: self.20260426070333.819	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426070333.819 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 07:04	Success	-	View
exp_self.20260426065612.818_20260426_065612 Paper: self.20260426065612.818	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426065612.818 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 06:57	Success	-	View
exp_self.20260426064852.817_20260426_064852 Paper: self.20260426064852.817	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426064852.817 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 06:49	Success	-	View
exp_pytrain.20260426064632.203_20260426_064633 Paper: pytrain.20260426064632.203	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 06:47	Success	-	View
exp_self.20260426063940.816_20260426_063940 Paper: self.20260426063940.816	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426063940.816 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 06:40	Success	-	View
exp_self.20260426063219.815_20260426_063219 Paper: self.20260426063219.815	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426063219.815 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 06:33	Success	-	View
exp_self.20260426062453.814_20260426_062454 Paper: self.20260426062453.814	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426062453.814 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 06:25	Success	-	View
exp_self.20260426061726.813_20260426_061726 Paper: self.20260426061726.813	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426061726.813 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 06:18	Success	-	View
exp_pytrain.20260426061356.202_20260426_061356 Paper: pytrain.20260426061356.202	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 06:14	Success	-	View
exp_self.20260426060948.812_20260426_060948 Paper: self.20260426060948.812	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426060948.812 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 06:10	Success	-	View
exp_self.20260426060225.811_20260426_060225 Paper: self.20260426060225.811	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426060225.811 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 06:03	Success	-	View
exp_self.20260426055506.810_20260426_055506 Paper: self.20260426055506.810	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426055506.810 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 05:56	Success	-	View
exp_self.20260426054749.809_20260426_054750 Paper: self.20260426054749.809	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426054749.809 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 05:48	Success	-	View
exp_pytrain.20260426054204.201_20260426_054204 Paper: pytrain.20260426054204.201	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 05:43	Success	-	View
exp_self.20260426054010.808_20260426_054010 Paper: self.20260426054010.808	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426054010.808 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 05:41	Success	-	View
exp_self.20260426053253.807_20260426_053254 Paper: self.20260426053253.807	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426053253.807 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 05:33	Success	-	View
exp_self.20260426052534.806_20260426_052534 Paper: self.20260426052534.806	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426052534.806 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 05:26	Success	-	View
exp_self.20260426051816.805_20260426_051816 Paper: self.20260426051816.805	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426051816.805 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 05:19	Success	-	View
exp_self.20260426051052.804_20260426_051052 Paper: self.20260426051052.804	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426051052.804 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 05:11	Success	-	View
exp_pytrain.20260426050832.200_20260426_050832 Paper: pytrain.20260426050832.200	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 05:09	Success	-	View
exp_self.20260426050142.803_20260426_050143 Paper: self.20260426050142.803	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426050142.803 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 05:02	Success	-	View
exp_self.20260426045419.802_20260426_045419 Paper: self.20260426045419.802	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426045419.802 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 04:55	Success	-	View
exp_self.20260426044652.801_20260426_044653 Paper: self.20260426044652.801	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426044652.801 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 04:47	Success	-	View
exp_self.20260426043931.800_20260426_043932 Paper: self.20260426043931.800	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426043931.800 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 04:40	Success	-	View
exp_pytrain.20260426043709.199_20260426_043710 Paper: pytrain.20260426043709.199	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 04:38	Success	-	View
exp_self.20260426043015.799_20260426_043015 Paper: self.20260426043015.799	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426043015.799 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 04:31	Success	-	View
exp_self.20260426042255.798_20260426_042255 Paper: self.20260426042255.798	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426042255.798 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 04:23	Success	-	View
exp_self.20260426041534.797_20260426_041534 Paper: self.20260426041534.797	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426041534.797 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 04:16	Success	-	View
exp_self.20260426040809.796_20260426_040810 Paper: self.20260426040809.796	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426040809.796 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 04:09	Success	-	View
exp_pytrain.20260426040547.198_20260426_040547 Paper: pytrain.20260426040547.198	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 04:06	Success	-	View
exp_self.20260426035852.795_20260426_035852 Paper: self.20260426035852.795	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426035852.795 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 03:59	Success	-	View
exp_self.20260426035127.794_20260426_035128 Paper: self.20260426035127.794	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426035127.794 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 03:52	Success	-	View
exp_self.20260426034406.793_20260426_034406 Paper: self.20260426034406.793	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426034406.793 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 03:45	Success	-	View
exp_self.20260426033644.792_20260426_033644 Paper: self.20260426033644.792	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426033644.792 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 03:37	Success	-	View
exp_pytrain.20260426033421.197_20260426_033422 Paper: pytrain.20260426033421.197	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 03:35	Success	-	View
exp_self.20260426032734.791_20260426_032735 Paper: self.20260426032734.791	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426032734.791 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 03:28	Success	-	View
exp_self.20260426032013.790_20260426_032014 Paper: self.20260426032013.790	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426032013.790 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 03:21	Success	-	View
exp_self.20260426031250.789_20260426_031251 Paper: self.20260426031250.789	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426031250.789 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 03:13	Success	-	View
exp_self.20260426030515.788_20260426_030515 Paper: self.20260426030515.788	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426030515.788 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 03:06	Success	-	View
exp_pytrain.20260426030254.196_20260426_030254 Paper: pytrain.20260426030254.196	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 03:03	Success	-	View
exp_self.20260426025604.787_20260426_025605 Paper: self.20260426025604.787	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426025604.787 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 02:57	Success	-	View
exp_self.20260426024843.786_20260426_024843 Paper: self.20260426024843.786	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426024843.786 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 02:49	Success	-	View
exp_self.20260426024118.785_20260426_024119 Paper: self.20260426024118.785	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426024118.785 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 02:42	Success	-	View
exp_self.20260426023356.784_20260426_023356 Paper: self.20260426023356.784	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426023356.784 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 02:34	Success	-	View
exp_pytrain.20260426023135.195_20260426_023135 Paper: pytrain.20260426023135.195	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 02:32	Success	-	View
exp_self.20260426022439.783_20260426_022440 Paper: self.20260426022439.783	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426022439.783 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 02:25	Success	-	View
exp_self.20260426021720.782_20260426_021720 Paper: self.20260426021720.782	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426021720.782 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 02:18	Success	-	View
exp_self.20260426021001.781_20260426_021001 Paper: self.20260426021001.781	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426021001.781 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 02:11	Success	-	View
exp_self.20260426020237.780_20260426_020237 Paper: self.20260426020237.780	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426020237.780 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 02:03	Success	-	View
exp_pytrain.20260426015905.194_20260426_015905 Paper: pytrain.20260426015905.194	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 02:00	Success	-	View
exp_self.20260426015456.779_20260426_015456 Paper: self.20260426015456.779	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426015456.779 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 01:55	Success	-	View
exp_self.20260426014732.778_20260426_014732 Paper: self.20260426014732.778	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426014732.778 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 01:48	Success	-	View
exp_self.20260426014011.777_20260426_014011 Paper: self.20260426014011.777	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426014011.777 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 01:41	Success	-	View
exp_gh_dilberx_universal-llm-telemetry-suite_20260426_013726 Paper: gh_dilberx_universal-llm-telemetry-suite	dilberx/universal-llm-telemetry-suite Paper ID: gh_dilberx_universal-llm-telemetry-suite - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected S...	04-26 01:38	Success	-	View
exp_self.20260426013030.776_20260426_013030 Paper: self.20260426013030.776	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426013030.776 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 01:31	Success	-	View
exp_pytrain.20260426012700.193_20260426_012700 Paper: pytrain.20260426012700.193	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 01:28	Success	-	View
exp_self.20260426012250.775_20260426_012250 Paper: self.20260426012250.775	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426012250.775 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 01:23	Success	-	View
exp_self.20260426011526.774_20260426_011527 Paper: self.20260426011526.774	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426011526.774 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 01:16	Success	-	View
exp_self.20260426010807.773_20260426_010807 Paper: self.20260426010807.773	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426010807.773 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 01:09	Success	-	View
exp_self.20260426010048.772_20260426_010049 Paper: self.20260426010048.772	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426010048.772 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 01:01	Success	-	View
exp_pytrain.20260426005503.192_20260426_005504 Paper: pytrain.20260426005503.192	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 00:56	Success	-	View
exp_self.20260426005308.771_20260426_005308 Paper: self.20260426005308.771	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426005308.771 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 00:54	Success	-	View
exp_self.20260426004549.770_20260426_004550 Paper: self.20260426004549.770	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426004549.770 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 00:46	Success	-	View
exp_self.20260426003834.769_20260426_003835 Paper: self.20260426003834.769	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426003834.769 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 00:39	Success	-	View
exp_self.20260426003115.768_20260426_003116 Paper: self.20260426003115.768	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426003115.768 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 00:32	Success	-	View
exp_self.20260426002353.767_20260426_002353 Paper: self.20260426002353.767	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426002353.767 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 00:24	Success	-	View
exp_pytrain.20260426002130.191_20260426_002131 Paper: pytrain.20260426002130.191	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-26 00:22	Success	-	View
exp_self.20260426001441.766_20260426_001441 Paper: self.20260426001441.766	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426001441.766 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 00:15	Success	-	View
exp_self.20260426000722.765_20260426_000723 Paper: self.20260426000722.765	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260426000722.765 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-26 00:08	Success	-	View
exp_cr_10.24143_2072-9502-2026-2-111-120_20260426_000407 Paper: cr_10.24143_2072-9502-2026-2-111-120	Fuzzy logic-based model for information security risk assessment of a territorially distributed internal affairs system Paper ID: cr_10.24143_2072-9502-2026-2-111-120 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signa...	04-26 00:05	Success	-	View
exp_cr_10.24143_2072-9502-2026-2-85-93_20260426_000043 Paper: cr_10.24143_2072-9502-2026-2-85-93	Optimizing the YOLO model for NPU operation Paper ID: cr_10.24143_2072-9502-2026-2-85-93 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal:...	04-26 00:01	Success	-	View
exp_self.20260425235838.764_20260425_235838 Paper: self.20260425235838.764	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425235838.764 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 23:59	Success	-	View
exp_self.20260425235120.763_20260425_235120 Paper: self.20260425235120.763	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425235120.763 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 23:52	Success	-	View
exp_pytrain.20260425234851.190_20260425_234852 Paper: pytrain.20260425234851.190	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 23:49	Success	-	View
exp_self.20260425234207.762_20260425_234208 Paper: self.20260425234207.762	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425234207.762 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 23:43	Success	-	View
exp_self.20260425233445.761_20260425_233446 Paper: self.20260425233445.761	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425233445.761 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 23:35	Success	-	View
exp_self.20260425232722.760_20260425_232722 Paper: self.20260425232722.760	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425232722.760 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 23:28	Success	-	View
exp_gh_eslammoha8625_llmtest-perf_20260425_232157 Paper: gh_eslammoha8625_llmtest-perf	eslammoha8625/llmtest-perf Paper ID: gh_eslammoha8625_llmtest-perf - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Reco...	04-25 23:22	Success	-	View
exp_self.20260425231954.759_20260425_231954 Paper: self.20260425231954.759	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425231954.759 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 23:20	Success	-	View
exp_pytrain.20260425231729.189_20260425_231730 Paper: pytrain.20260425231729.189	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 23:18	Success	-	View
exp_self.20260425231041.758_20260425_231042 Paper: self.20260425231041.758	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425231041.758 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 23:11	Success	-	View
exp_self.20260425230320.757_20260425_230320 Paper: self.20260425230320.757	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425230320.757 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 23:04	Success	-	View
exp_self.20260425225600.756_20260425_225601 Paper: self.20260425225600.756	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425225600.756 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 22:57	Success	-	View
exp_self.20260425224837.755_20260425_224837 Paper: self.20260425224837.755	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425224837.755 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 22:49	Success	-	View
exp_pytrain.20260425224608.188_20260425_224609 Paper: pytrain.20260425224608.188	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 22:47	Success	-	View
exp_self.20260425223923.754_20260425_223924 Paper: self.20260425223923.754	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425223923.754 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 22:40	Success	-	View
exp_self.20260425223159.753_20260425_223159 Paper: self.20260425223159.753	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425223159.753 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 22:33	Success	-	View
exp_self.20260425222438.752_20260425_222438 Paper: self.20260425222438.752	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425222438.752 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 22:25	Success	-	View
exp_self.20260425221715.751_20260425_221715 Paper: self.20260425221715.751	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425221715.751 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 22:18	Success	-	View
exp_pytrain.20260425221443.187_20260425_221443 Paper: pytrain.20260425221443.187	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 22:15	Success	-	View
exp_self.20260425220755.750_20260425_220755 Paper: self.20260425220755.750	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425220755.750 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 22:08	Success	-	View
exp_self.20260425220030.749_20260425_220030 Paper: self.20260425220030.749	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425220030.749 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 22:01	Success	-	View
exp_self.20260425215301.748_20260425_215302 Paper: self.20260425215301.748	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425215301.748 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 21:54	Success	-	View
exp_self.20260425214541.747_20260425_214541 Paper: self.20260425214541.747	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425214541.747 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 21:46	Success	-	View
exp_pytrain.20260425214314.186_20260425_214314 Paper: pytrain.20260425214314.186	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 21:44	Success	-	View
exp_self.20260425213626.746_20260425_213627 Paper: self.20260425213626.746	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425213626.746 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 21:37	Success	-	View
exp_self.20260425212857.745_20260425_212857 Paper: self.20260425212857.745	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425212857.745 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 21:29	Success	-	View
exp_self.20260425212125.744_20260425_212126 Paper: self.20260425212125.744	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425212125.744 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 21:22	Success	-	View
exp_self.20260425211405.743_20260425_211405 Paper: self.20260425211405.743	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425211405.743 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 21:15	Success	-	View
exp_pytrain.20260425211136.185_20260425_211136 Paper: pytrain.20260425211136.185	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 21:12	Success	-	View
exp_self.20260425210446.742_20260425_210446 Paper: self.20260425210446.742	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425210446.742 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 21:05	Success	-	View
exp_self.20260425205716.741_20260425_205717 Paper: self.20260425205716.741	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425205716.741 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 20:58	Success	-	View
exp_self.20260425204946.740_20260425_204947 Paper: self.20260425204946.740	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425204946.740 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 20:50	Success	-	View
exp_self.20260425204221.739_20260425_204221 Paper: self.20260425204221.739	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425204221.739 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 20:43	Success	-	View
exp_pytrain.20260425203958.184_20260425_203958 Paper: pytrain.20260425203958.184	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 20:41	Success	-	View
exp_self.20260425203304.738_20260425_203304 Paper: self.20260425203304.738	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425203304.738 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 20:34	Success	-	View
exp_self.20260425202538.737_20260425_202539 Paper: self.20260425202538.737	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425202538.737 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 20:26	Success	-	View
exp_self.20260425201812.736_20260425_201812 Paper: self.20260425201812.736	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425201812.736 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 20:19	Success	-	View
exp_self.20260425201045.735_20260425_201045 Paper: self.20260425201045.735	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425201045.735 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 20:11	Success	-	View
exp_pytrain.20260425200823.183_20260425_200824 Paper: pytrain.20260425200823.183	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 20:09	Success	-	View
exp_self.20260425200124.734_20260425_200125 Paper: self.20260425200124.734	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425200124.734 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 20:02	Success	-	View
exp_self.20260425195403.733_20260425_195403 Paper: self.20260425195403.733	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425195403.733 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 19:55	Success	-	View
exp_gh_Ac3v3d0_semafold_20260425_194941 Paper: gh_Ac3v3d0_semafold	Ac3v3d0/semafold Paper ID: gh_Ac3v3d0_semafold - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benc...	04-25 19:50	Success	-	View
exp_self.20260425194632.732_20260425_194633 Paper: self.20260425194632.732	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425194632.732 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 19:47	Success	-	View
exp_self.20260425193852.731_20260425_193852 Paper: self.20260425193852.731	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425193852.731 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 19:39	Success	-	View
exp_pytrain.20260425193623.182_20260425_193624 Paper: pytrain.20260425193623.182	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 19:37	Success	-	View
exp_self.20260425192934.730_20260425_192935 Paper: self.20260425192934.730	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425192934.730 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 19:30	Success	-	View
exp_self.20260425192211.729_20260425_192211 Paper: self.20260425192211.729	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425192211.729 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 19:23	Success	-	View
exp_self.20260425191447.728_20260425_191448 Paper: self.20260425191447.728	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425191447.728 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 19:15	Success	-	View
exp_self.20260425190726.727_20260425_190727 Paper: self.20260425190726.727	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425190726.727 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 19:08	Success	-	View
exp_pytrain.20260425190500.181_20260425_190500 Paper: pytrain.20260425190500.181	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 19:06	Success	-	View
exp_self.20260425185801.726_20260425_185802 Paper: self.20260425185801.726	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425185801.726 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 18:59	Success	-	View
exp_self.20260425185033.725_20260425_185034 Paper: self.20260425185033.725	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425185033.725 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 18:51	Success	-	View
exp_self.20260425184307.724_20260425_184308 Paper: self.20260425184307.724	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425184307.724 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 18:44	Success	-	View
exp_self.20260425183544.723_20260425_183544 Paper: self.20260425183544.723	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425183544.723 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 18:36	Success	-	View
exp_pytrain.20260425183316.180_20260425_183316 Paper: pytrain.20260425183316.180	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 18:34	Success	-	View
exp_self.20260425182627.722_20260425_182628 Paper: self.20260425182627.722	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425182627.722 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 18:27	Success	-	View
exp_self.20260425181859.721_20260425_181900 Paper: self.20260425181859.721	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425181859.721 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 18:20	Success	-	View
exp_self.20260425181132.720_20260425_181133 Paper: self.20260425181132.720	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425181132.720 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 18:12	Success	-	View
exp_self.20260425180402.719_20260425_180402 Paper: self.20260425180402.719	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425180402.719 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 18:05	Success	-	View
exp_pytrain.20260425180138.179_20260425_180138 Paper: pytrain.20260425180138.179	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 18:02	Success	-	View
exp_self.20260425175448.718_20260425_175449 Paper: self.20260425175448.718	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425175448.718 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 17:55	Success	-	View
exp_self.20260425174726.717_20260425_174727 Paper: self.20260425174726.717	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425174726.717 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 17:48	Success	-	View
exp_self.20260425174001.716_20260425_174001 Paper: self.20260425174001.716	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425174001.716 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 17:41	Success	-	View
exp_self.20260425173230.715_20260425_173230 Paper: self.20260425173230.715	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425173230.715 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 17:33	Success	-	View
exp_pytrain.20260425173007.178_20260425_173007 Paper: pytrain.20260425173007.178	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 17:31	Success	-	View
exp_self.20260425172317.714_20260425_172317 Paper: self.20260425172317.714	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425172317.714 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 17:24	Success	-	View
exp_self.20260425171556.713_20260425_171557 Paper: self.20260425171556.713	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425171556.713 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 17:16	Success	-	View
exp_self.20260425170833.712_20260425_170833 Paper: self.20260425170833.712	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425170833.712 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 17:09	Success	-	View
exp_self.20260425170101.711_20260425_170101 Paper: self.20260425170101.711	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425170101.711 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 17:02	Success	-	View
exp_pytrain.20260425165838.177_20260425_165839 Paper: pytrain.20260425165838.177	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 16:59	Success	-	View
exp_self.20260425165143.710_20260425_165144 Paper: self.20260425165143.710	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425165143.710 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 16:52	Success	-	View
exp_self.20260425164422.709_20260425_164423 Paper: self.20260425164422.709	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425164422.709 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 16:45	Success	-	View
exp_self.20260425163659.708_20260425_163700 Paper: self.20260425163659.708	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425163659.708 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 16:38	Success	-	View
exp_self.20260425162934.707_20260425_162934 Paper: self.20260425162934.707	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425162934.707 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 16:30	Success	-	View
exp_pytrain.20260425162710.176_20260425_162710 Paper: pytrain.20260425162710.176	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 16:28	Success	-	View
exp_self.20260425162014.706_20260425_162014 Paper: self.20260425162014.706	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425162014.706 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 16:21	Success	-	View
exp_self.20260425161254.705_20260425_161254 Paper: self.20260425161254.705	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425161254.705 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 16:13	Success	-	View
exp_self.20260425160532.704_20260425_160533 Paper: self.20260425160532.704	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425160532.704 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 16:06	Success	-	View
exp_self.20260425155807.703_20260425_155808 Paper: self.20260425155807.703	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425155807.703 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 15:59	Success	-	View
exp_pytrain.20260425155541.175_20260425_155541 Paper: pytrain.20260425155541.175	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 15:56	Success	-	View
exp_self.20260425154842.702_20260425_154843 Paper: self.20260425154842.702	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425154842.702 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 15:49	Success	-	View
exp_self.20260425154114.701_20260425_154114 Paper: self.20260425154114.701	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425154114.701 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 15:42	Success	-	View
exp_self.20260425153351.700_20260425_153351 Paper: self.20260425153351.700	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425153351.700 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 15:34	Success	-	View
exp_self.20260425152626.699_20260425_152626 Paper: self.20260425152626.699	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425152626.699 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 15:27	Success	-	View
exp_pytrain.20260425152356.174_20260425_152357 Paper: pytrain.20260425152356.174	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 15:24	Success	-	View
exp_self.20260425151708.698_20260425_151708 Paper: self.20260425151708.698	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425151708.698 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 15:18	Success	-	View
exp_self.20260425150939.697_20260425_150939 Paper: self.20260425150939.697	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425150939.697 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 15:10	Success	-	View
exp_self.20260425150211.696_20260425_150211 Paper: self.20260425150211.696	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425150211.696 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 15:03	Success	-	View
exp_self.20260425145443.695_20260425_145443 Paper: self.20260425145443.695	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425145443.695 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 14:55	Success	-	View
exp_pytrain.20260425145213.173_20260425_145214 Paper: pytrain.20260425145213.173	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 14:53	Success	-	View
exp_self.20260425144523.694_20260425_144524 Paper: self.20260425144523.694	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425144523.694 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 14:46	Success	-	View
exp_self.20260425143752.693_20260425_143753 Paper: self.20260425143752.693	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425143752.693 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 14:38	Success	-	View
exp_self.20260425143027.692_20260425_143027 Paper: self.20260425143027.692	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425143027.692 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 14:31	Success	-	View
exp_self.20260425142259.691_20260425_142259 Paper: self.20260425142259.691	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425142259.691 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 14:24	Success	-	View
exp_pytrain.20260425142032.172_20260425_142032 Paper: pytrain.20260425142032.172	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 14:21	Success	-	View
exp_self.20260425141334.690_20260425_141335 Paper: self.20260425141334.690	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425141334.690 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 14:14	Success	-	View
exp_self.20260425140558.689_20260425_140558 Paper: self.20260425140558.689	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425140558.689 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 14:07	Success	-	View
exp_self.20260425135824.688_20260425_135825 Paper: self.20260425135824.688	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425135824.688 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 13:59	Success	-	View
exp_self.20260425135058.687_20260425_135059 Paper: self.20260425135058.687	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425135058.687 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 13:52	Success	-	View
exp_pytrain.20260425134830.171_20260425_134831 Paper: pytrain.20260425134830.171	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 13:49	Success	-	View
exp_self.20260425134135.686_20260425_134135 Paper: self.20260425134135.686	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425134135.686 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 13:42	Success	-	View
exp_self.20260425133405.685_20260425_133405 Paper: self.20260425133405.685	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425133405.685 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 13:35	Success	-	View
exp_self.20260425132637.684_20260425_132637 Paper: self.20260425132637.684	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425132637.684 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 13:27	Success	-	View
exp_self.20260425131914.683_20260425_131914 Paper: self.20260425131914.683	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425131914.683 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 13:20	Success	-	View
exp_pytrain.20260425131647.170_20260425_131648 Paper: pytrain.20260425131647.170	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 13:17	Success	-	View
exp_self.20260425130954.682_20260425_130954 Paper: self.20260425130954.682	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425130954.682 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 13:10	Success	-	View
exp_self.20260425130221.681_20260425_130221 Paper: self.20260425130221.681	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425130221.681 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 13:03	Success	-	View
exp_self.20260425125459.680_20260425_125459 Paper: self.20260425125459.680	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425125459.680 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 12:56	Success	-	View
exp_self.20260425124733.679_20260425_124734 Paper: self.20260425124733.679	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425124733.679 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 12:48	Success	-	View
exp_pytrain.20260425124515.169_20260425_124515 Paper: pytrain.20260425124515.169	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 12:46	Success	-	View
exp_self.20260425123822.678_20260425_123823 Paper: self.20260425123822.678	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425123822.678 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 12:39	Success	-	View
exp_self.20260425123101.677_20260425_123101 Paper: self.20260425123101.677	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425123101.677 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 12:32	Success	-	View
exp_self.20260425122333.676_20260425_122334 Paper: self.20260425122333.676	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425122333.676 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 12:24	Success	-	View
exp_self.20260425121607.675_20260425_121607 Paper: self.20260425121607.675	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425121607.675 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 12:17	Success	-	View
exp_pytrain.20260425121346.168_20260425_121346 Paper: pytrain.20260425121346.168	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 12:14	Success	-	View
exp_self.20260425120648.674_20260425_120648 Paper: self.20260425120648.674	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425120648.674 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 12:07	Success	-	View
exp_self.20260425115926.673_20260425_115927 Paper: self.20260425115926.673	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425115926.673 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 12:00	Success	-	View
exp_self.20260425115155.672_20260425_115155 Paper: self.20260425115155.672	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425115155.672 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 11:52	Success	-	View
exp_self.20260425114417.671_20260425_114418 Paper: self.20260425114417.671	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425114417.671 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 11:45	Success	-	View
exp_pytrain.20260425114151.167_20260425_114151 Paper: pytrain.20260425114151.167	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 11:42	Success	-	View
exp_self.20260425113545.670_20260425_113545 Paper: self.20260425113545.670	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425113545.670 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 11:36	Success	-	View
exp_self.20260425112819.669_20260425_112820 Paper: self.20260425112819.669	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425112819.669 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 11:29	Success	-	View
exp_self.20260425112050.668_20260425_112050 Paper: self.20260425112050.668	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425112050.668 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 11:21	Success	-	View
exp_self.20260425111304.667_20260425_111305 Paper: self.20260425111304.667	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425111304.667 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 11:14	Success	-	View
exp_pytrain.20260425111035.166_20260425_111036 Paper: pytrain.20260425111035.166	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 11:11	Success	-	View
exp_self.20260425110336.666_20260425_110336 Paper: self.20260425110336.666	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425110336.666 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 11:04	Success	-	View
exp_self.20260425105606.665_20260425_105607 Paper: self.20260425105606.665	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425105606.665 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 10:57	Success	-	View
exp_self.20260425104835.664_20260425_104835 Paper: self.20260425104835.664	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425104835.664 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 10:49	Success	-	View
exp_self.20260425104059.663_20260425_104059 Paper: self.20260425104059.663	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425104059.663 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 10:42	Success	-	View
exp_pytrain.20260425103825.165_20260425_103826 Paper: pytrain.20260425103825.165	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 10:39	Success	-	View
exp_self.20260425103130.662_20260425_103131 Paper: self.20260425103130.662	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425103130.662 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 10:32	Success	-	View
exp_self.20260425102351.661_20260425_102352 Paper: self.20260425102351.661	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425102351.661 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 10:24	Success	-	View
exp_self.20260425101621.660_20260425_101621 Paper: self.20260425101621.660	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425101621.660 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 10:17	Success	-	View
exp_self.20260425100844.659_20260425_100844 Paper: self.20260425100844.659	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425100844.659 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 10:09	Success	-	View
exp_pytrain.20260425100606.164_20260425_100607 Paper: pytrain.20260425100606.164	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 10:07	Success	-	View
exp_self.20260425095919.658_20260425_095920 Paper: self.20260425095919.658	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425095919.658 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 10:00	Success	-	View
exp_self.20260425095149.657_20260425_095149 Paper: self.20260425095149.657	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425095149.657 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 09:52	Success	-	View
exp_self.20260425094421.656_20260425_094421 Paper: self.20260425094421.656	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425094421.656 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 09:45	Success	-	View
exp_self.20260425093658.655_20260425_093658 Paper: self.20260425093658.655	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425093658.655 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 09:38	Success	-	View
exp_pytrain.20260425093433.163_20260425_093434 Paper: pytrain.20260425093433.163	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 09:35	Success	-	View
exp_self.20260425092739.654_20260425_092740 Paper: self.20260425092739.654	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425092739.654 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 09:28	Success	-	View
exp_self.20260425092008.653_20260425_092008 Paper: self.20260425092008.653	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425092008.653 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 09:21	Success	-	View
exp_self.20260425091236.652_20260425_091237 Paper: self.20260425091236.652	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425091236.652 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 09:13	Success	-	View
exp_self.20260425090515.651_20260425_090516 Paper: self.20260425090515.651	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425090515.651 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 09:06	Success	-	View
exp_pytrain.20260425090249.162_20260425_090250 Paper: pytrain.20260425090249.162	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 09:03	Success	-	View
exp_self.20260425085830.650_20260425_085831 Paper: self.20260425085830.650	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425085830.650 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 08:59	Success	-	View
exp_self.20260425085107.649_20260425_085107 Paper: self.20260425085107.649	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425085107.649 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 08:52	Success	-	View
exp_cr_10.1177_01466453251412512_20260425_084817 Paper: cr_10.1177_01466453251412512	A short review of published multi-model inference studies in radiation epidemiology and some new developments Paper ID: cr_10.1177_01466453251412512 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recov...	04-25 08:49	Success	-	View
exp_self.20260425084118.648_20260425_084119 Paper: self.20260425084118.648	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425084118.648 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 08:42	Success	-	View
exp_self.20260425083355.647_20260425_083356 Paper: self.20260425083355.647	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425083355.647 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 08:34	Success	-	View
exp_pytrain.20260425083134.161_20260425_083135 Paper: pytrain.20260425083134.161	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 08:32	Success	-	View
exp_self.20260425082439.646_20260425_082440 Paper: self.20260425082439.646	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425082439.646 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 08:25	Success	-	View
exp_self.20260425081718.645_20260425_081719 Paper: self.20260425081718.645	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425081718.645 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 08:18	Success	-	View
exp_self.20260425080954.644_20260425_080954 Paper: self.20260425080954.644	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425080954.644 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 08:10	Success	-	View
exp_self.20260425080229.643_20260425_080229 Paper: self.20260425080229.643	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425080229.643 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 08:03	Success	-	View
exp_pytrain.20260425080009.160_20260425_080010 Paper: pytrain.20260425080009.160	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 08:01	Success	-	View
exp_self.20260425075312.642_20260425_075313 Paper: self.20260425075312.642	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425075312.642 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 07:54	Success	-	View
exp_self.20260425074553.641_20260425_074554 Paper: self.20260425074553.641	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425074553.641 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 07:46	Success	-	View
exp_self.20260425073825.640_20260425_073826 Paper: self.20260425073825.640	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425073825.640 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 07:39	Success	-	View
exp_self.20260425073101.639_20260425_073102 Paper: self.20260425073101.639	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425073101.639 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 07:32	Success	-	View
exp_pytrain.20260425072840.159_20260425_072841 Paper: pytrain.20260425072840.159	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 07:29	Success	-	View
exp_self.20260425072147.638_20260425_072147 Paper: self.20260425072147.638	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425072147.638 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 07:22	Success	-	View
exp_self.20260425071428.637_20260425_071428 Paper: self.20260425071428.637	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425071428.637 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 07:15	Success	-	View
exp_self.20260425070635.636_20260425_070636 Paper: self.20260425070635.636	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425070635.636 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 07:07	Success	-	View
exp_self.20260425065841.635_20260425_065841 Paper: self.20260425065841.635	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425065841.635 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 06:59	Success	-	View
exp_pytrain.20260425065557.158_20260425_065557 Paper: pytrain.20260425065557.158	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 06:57	Success	-	View
exp_self.20260425065015.634_20260425_065016 Paper: self.20260425065015.634	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425065015.634 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 06:51	Success	-	View
exp_self.20260425064220.633_20260425_064220 Paper: self.20260425064220.633	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425064220.633 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 06:43	Success	-	View
exp_self.20260425063435.632_20260425_063435 Paper: self.20260425063435.632	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425063435.632 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 06:35	Success	-	View
exp_self.20260425062658.631_20260425_062658 Paper: self.20260425062658.631	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425062658.631 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 06:28	Success	-	View
exp_pytrain.20260425062421.157_20260425_062422 Paper: pytrain.20260425062421.157	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 06:25	Success	-	View
exp_self.20260425061725.630_20260425_061725 Paper: self.20260425061725.630	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425061725.630 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 06:18	Success	-	View
exp_self.20260425061000.629_20260425_061000 Paper: self.20260425061000.629	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425061000.629 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 06:11	Success	-	View
exp_self.20260425060232.628_20260425_060232 Paper: self.20260425060232.628	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425060232.628 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 06:03	Success	-	View
exp_self.20260425055513.627_20260425_055513 Paper: self.20260425055513.627	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425055513.627 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 05:56	Success	-	View
exp_pytrain.20260425055252.156_20260425_055253 Paper: pytrain.20260425055252.156	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 05:53	Success	-	View
exp_self.20260425054558.626_20260425_054558 Paper: self.20260425054558.626	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425054558.626 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 05:47	Success	-	View
exp_self.20260425053832.625_20260425_053833 Paper: self.20260425053832.625	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425053832.625 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 05:39	Success	-	View
exp_self.20260425053108.624_20260425_053108 Paper: self.20260425053108.624	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425053108.624 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 05:32	Success	-	View
exp_self.20260425052343.623_20260425_052344 Paper: self.20260425052343.623	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425052343.623 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 05:24	Success	-	View
exp_pytrain.20260425052125.155_20260425_052125 Paper: pytrain.20260425052125.155	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 05:22	Success	-	View
exp_self.20260425051431.622_20260425_051431 Paper: self.20260425051431.622	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425051431.622 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 05:15	Success	-	View
exp_self.20260425050703.621_20260425_050704 Paper: self.20260425050703.621	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425050703.621 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 05:08	Success	-	View
exp_self.20260425045935.620_20260425_045935 Paper: self.20260425045935.620	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425045935.620 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 05:00	Success	-	View
exp_self.20260425045207.619_20260425_045207 Paper: self.20260425045207.619	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425045207.619 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 04:53	Success	-	View
exp_pytrain.20260425044947.154_20260425_044948 Paper: pytrain.20260425044947.154	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 04:50	Success	-	View
exp_self.20260425044253.618_20260425_044253 Paper: self.20260425044253.618	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425044253.618 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 04:43	Success	-	View
exp_self.20260425043535.617_20260425_043535 Paper: self.20260425043535.617	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425043535.617 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 04:36	Success	-	View
exp_self.20260425042813.616_20260425_042813 Paper: self.20260425042813.616	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425042813.616 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 04:29	Success	-	View
exp_self.20260425042046.615_20260425_042046 Paper: self.20260425042046.615	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425042046.615 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 04:21	Success	-	View
exp_pytrain.20260425041823.153_20260425_041824 Paper: pytrain.20260425041823.153	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 04:19	Success	-	View
exp_self.20260425041127.614_20260425_041128 Paper: self.20260425041127.614	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425041127.614 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 04:12	Success	-	View
exp_self.20260425040406.613_20260425_040406 Paper: self.20260425040406.613	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425040406.613 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 04:05	Success	-	View
exp_self.20260425035644.612_20260425_035645 Paper: self.20260425035644.612	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425035644.612 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 03:57	Success	-	View
exp_self.20260425034919.611_20260425_034920 Paper: self.20260425034919.611	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425034919.611 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 03:50	Success	-	View
exp_pytrain.20260425034655.152_20260425_034656 Paper: pytrain.20260425034655.152	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 03:47	Success	-	View
exp_self.20260425034001.610_20260425_034001 Paper: self.20260425034001.610	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425034001.610 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 03:41	Success	-	View
exp_self.20260425033241.609_20260425_033241 Paper: self.20260425033241.609	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425033241.609 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 03:33	Success	-	View
exp_self.20260425032519.608_20260425_032519 Paper: self.20260425032519.608	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425032519.608 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 03:26	Success	-	View
exp_self.20260425031755.607_20260425_031756 Paper: self.20260425031755.607	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425031755.607 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 03:18	Success	-	View
exp_pytrain.20260425031525.151_20260425_031525 Paper: pytrain.20260425031525.151	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 03:16	Success	-	View
exp_self.20260425030840.606_20260425_030840 Paper: self.20260425030840.606	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425030840.606 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 03:09	Success	-	View
exp_self.20260425030115.605_20260425_030115 Paper: self.20260425030115.605	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425030115.605 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 03:02	Success	-	View
exp_self.20260425025357.604_20260425_025357 Paper: self.20260425025357.604	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425025357.604 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 02:55	Success	-	View
exp_self.20260425024639.603_20260425_024639 Paper: self.20260425024639.603	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425024639.603 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 02:47	Success	-	View
exp_pytrain.20260425024408.150_20260425_024409 Paper: pytrain.20260425024408.150	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 02:45	Success	-	View
exp_self.20260425023716.602_20260425_023717 Paper: self.20260425023716.602	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425023716.602 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 02:38	Success	-	View
exp_self.20260425022952.601_20260425_022952 Paper: self.20260425022952.601	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425022952.601 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 02:30	Success	-	View
exp_self.20260425022234.600_20260425_022235 Paper: self.20260425022234.600	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425022234.600 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 02:23	Success	-	View
exp_self.20260425021513.599_20260425_021513 Paper: self.20260425021513.599	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425021513.599 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 02:16	Success	-	View
exp_pytrain.20260425021248.149_20260425_021248 Paper: pytrain.20260425021248.149	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 02:13	Success	-	View
exp_self.20260425020603.598_20260425_020603 Paper: self.20260425020603.598	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425020603.598 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 02:07	Success	-	View
exp_self.20260425015841.597_20260425_015842 Paper: self.20260425015841.597	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425015841.597 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 01:59	Success	-	View
exp_self.20260425015116.596_20260425_015117 Paper: self.20260425015116.596	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425015116.596 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 01:52	Success	-	View
exp_cr_10.55041_ijsmt.v2i4.199_20260425_014809 Paper: cr_10.55041_ijsmt.v2i4.199	AI-Driven Resume Skill Extraction and Job Recommendation System using Hybrid Transformer Mamba Model Paper ID: cr_10.55041_ijsmt.v2i4.199 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recover...	04-25 01:49	Success	-	View
exp_cr_10.1038_s41598-026-49734-2_20260425_014439 Paper: cr_10.1038_s41598-026-49734-2	A multi-cognitive PCB defect detection model integrating Mamba Paper ID: cr_10.1038_s41598-026-49734-2 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Reco...	04-25 01:45	Success	-	View
exp_self.20260425014239.595_20260425_014239 Paper: self.20260425014239.595	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425014239.595 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 01:43	Success	-	View
exp_pytrain.20260425014015.148_20260425_014016 Paper: pytrain.20260425014015.148	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 01:41	Success	-	View
exp_self.20260425013324.594_20260425_013324 Paper: self.20260425013324.594	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425013324.594 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 01:34	Success	-	View
exp_self.20260425012558.593_20260425_012559 Paper: self.20260425012558.593	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425012558.593 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 01:27	Success	-	View
exp_self.20260425011835.592_20260425_011835 Paper: self.20260425011835.592	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425011835.592 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 01:19	Success	-	View
exp_self.20260425011116.591_20260425_011116 Paper: self.20260425011116.591	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425011116.591 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 01:12	Success	-	View
exp_pytrain.20260425010855.147_20260425_010856 Paper: pytrain.20260425010855.147	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 01:09	Success	-	View
exp_self.20260425010200.590_20260425_010200 Paper: self.20260425010200.590	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425010200.590 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 01:03	Success	-	View
exp_self.20260425005423.589_20260425_005423 Paper: self.20260425005423.589	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425005423.589 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 00:55	Success	-	View
exp_self.20260425004701.588_20260425_004701 Paper: self.20260425004701.588	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425004701.588 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 00:48	Success	-	View
exp_self.20260425003933.587_20260425_003934 Paper: self.20260425003933.587	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425003933.587 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 00:40	Success	-	View
exp_pytrain.20260425003715.146_20260425_003715 Paper: pytrain.20260425003715.146	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 00:38	Success	-	View
exp_self.20260425003016.586_20260425_003017 Paper: self.20260425003016.586	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425003016.586 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 00:31	Success	-	View
exp_self.20260425002255.585_20260425_002255 Paper: self.20260425002255.585	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425002255.585 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 00:23	Success	-	View
exp_self.20260425001533.584_20260425_001533 Paper: self.20260425001533.584	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425001533.584 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 00:16	Success	-	View
exp_self.20260425000816.583_20260425_000816 Paper: self.20260425000816.583	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260425000816.583 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 00:09	Success	-	View
exp_pytrain.20260425000551.145_20260425_000552 Paper: pytrain.20260425000551.145	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-25 00:06	Success	-	View
exp_self.20260424235859.582_20260424_235859 Paper: self.20260424235859.582	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424235859.582 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-25 00:00	Success	-	View
exp_self.20260424235135.581_20260424_235135 Paper: self.20260424235135.581	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424235135.581 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 23:52	Success	-	View
exp_self.20260424234407.580_20260424_234407 Paper: self.20260424234407.580	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424234407.580 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 23:45	Success	-	View
exp_self.20260424233646.579_20260424_233646 Paper: self.20260424233646.579	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424233646.579 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 23:37	Success	-	View
exp_pytrain.20260424233420.144_20260424_233420 Paper: pytrain.20260424233420.144	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 23:35	Success	-	View
exp_self.20260424232727.578_20260424_232728 Paper: self.20260424232727.578	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424232727.578 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 23:28	Success	-	View
exp_self.20260424232005.577_20260424_232005 Paper: self.20260424232005.577	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424232005.577 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 23:21	Success	-	View
exp_self.20260424231243.576_20260424_231243 Paper: self.20260424231243.576	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424231243.576 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 23:13	Success	-	View
exp_self.20260424230516.575_20260424_230517 Paper: self.20260424230516.575	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424230516.575 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 23:06	Success	-	View
exp_pytrain.20260424230256.143_20260424_230256 Paper: pytrain.20260424230256.143	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 23:03	Success	-	View
exp_self.20260424225604.574_20260424_225604 Paper: self.20260424225604.574	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424225604.574 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 22:57	Success	-	View
exp_self.20260424224843.573_20260424_224844 Paper: self.20260424224843.573	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424224843.573 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 22:49	Success	-	View
exp_self.20260424224119.572_20260424_224120 Paper: self.20260424224119.572	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424224119.572 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 22:42	Success	-	View
exp_self.20260424223358.571_20260424_223358 Paper: self.20260424223358.571	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424223358.571 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 22:35	Success	-	View
exp_pytrain.20260424223139.142_20260424_223139 Paper: pytrain.20260424223139.142	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 22:32	Success	-	View
exp_self.20260424222444.570_20260424_222445 Paper: self.20260424222444.570	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424222444.570 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 22:25	Success	-	View
exp_self.20260424221721.569_20260424_221721 Paper: self.20260424221721.569	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424221721.569 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 22:18	Success	-	View
exp_self.20260424220957.568_20260424_220958 Paper: self.20260424220957.568	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424220957.568 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 22:11	Success	-	View
exp_self.20260424220235.567_20260424_220235 Paper: self.20260424220235.567	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424220235.567 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 22:03	Success	-	View
exp_pytrain.20260424220015.141_20260424_220015 Paper: pytrain.20260424220015.141	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 22:01	Success	-	View
exp_self.20260424215320.566_20260424_215321 Paper: self.20260424215320.566	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424215320.566 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 21:54	Success	-	View
exp_self.20260424214602.565_20260424_214602 Paper: self.20260424214602.565	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424214602.565 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 21:47	Success	-	View
exp_self.20260424213840.564_20260424_213840 Paper: self.20260424213840.564	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424213840.564 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 21:39	Success	-	View
exp_self.20260424213113.563_20260424_213113 Paper: self.20260424213113.563	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424213113.563 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 21:32	Success	-	View
exp_pytrain.20260424212851.140_20260424_212851 Paper: pytrain.20260424212851.140	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 21:29	Success	-	View
exp_self.20260424212154.562_20260424_212155 Paper: self.20260424212154.562	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424212154.562 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 21:22	Success	-	View
exp_self.20260424211431.561_20260424_211432 Paper: self.20260424211431.561	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424211431.561 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 21:15	Success	-	View
exp_self.20260424210709.560_20260424_210709 Paper: self.20260424210709.560	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424210709.560 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 21:08	Success	-	View
exp_self.20260424205949.559_20260424_205949 Paper: self.20260424205949.559	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424205949.559 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 21:00	Success	-	View
exp_pytrain.20260424205728.139_20260424_205729 Paper: pytrain.20260424205728.139	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 20:58	Success	-	View
exp_self.20260424205033.558_20260424_205034 Paper: self.20260424205033.558	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424205033.558 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 20:51	Success	-	View
exp_self.20260424204313.557_20260424_204313 Paper: self.20260424204313.557	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424204313.557 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 20:44	Success	-	View
exp_self.20260424203550.556_20260424_203550 Paper: self.20260424203550.556	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424203550.556 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 20:36	Success	-	View
exp_self.20260424202823.555_20260424_202823 Paper: self.20260424202823.555	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424202823.555 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 20:29	Success	-	View
exp_pytrain.20260424202604.138_20260424_202604 Paper: pytrain.20260424202604.138	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 20:27	Success	-	View
exp_self.20260424201907.554_20260424_201907 Paper: self.20260424201907.554	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424201907.554 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 20:20	Success	-	View
exp_self.20260424201146.553_20260424_201146 Paper: self.20260424201146.553	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424201146.553 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 20:12	Success	-	View
exp_self.20260424200425.552_20260424_200426 Paper: self.20260424200425.552	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424200425.552 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 20:05	Success	-	View
exp_self.20260424195706.551_20260424_195706 Paper: self.20260424195706.551	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424195706.551 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 19:58	Success	-	View
exp_pytrain.20260424195439.137_20260424_195440 Paper: pytrain.20260424195439.137	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 19:55	Success	-	View
exp_self.20260424194748.550_20260424_194749 Paper: self.20260424194748.550	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424194748.550 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 19:48	Success	-	View
exp_self.20260424194026.549_20260424_194027 Paper: self.20260424194026.549	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424194026.549 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 19:41	Success	-	View
exp_self.20260424193305.548_20260424_193305 Paper: self.20260424193305.548	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424193305.548 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 19:34	Success	-	View
exp_self.20260424192544.547_20260424_192544 Paper: self.20260424192544.547	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424192544.547 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 19:26	Success	-	View
exp_pytrain.20260424192319.136_20260424_192320 Paper: pytrain.20260424192319.136	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 19:24	Success	-	View
exp_self.20260424191635.546_20260424_191635 Paper: self.20260424191635.546	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424191635.546 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 19:17	Success	-	View
exp_self.20260424190905.545_20260424_190906 Paper: self.20260424190905.545	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424190905.545 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 19:10	Success	-	View
exp_self.20260424190139.544_20260424_190140 Paper: self.20260424190139.544	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424190139.544 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 19:02	Success	-	View
exp_self.20260424185418.543_20260424_185418 Paper: self.20260424185418.543	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424185418.543 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 18:55	Success	-	View
exp_pytrain.20260424185157.135_20260424_185157 Paper: pytrain.20260424185157.135	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 18:52	Success	-	View
exp_self.20260424184503.542_20260424_184504 Paper: self.20260424184503.542	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424184503.542 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 18:46	Success	-	View
exp_self.20260424183743.541_20260424_183743 Paper: self.20260424183743.541	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424183743.541 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 18:38	Success	-	View
exp_self.20260424183016.540_20260424_183016 Paper: self.20260424183016.540	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424183016.540 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 18:31	Success	-	View
exp_self.20260424182250.539_20260424_182250 Paper: self.20260424182250.539	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424182250.539 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 18:23	Success	-	View
exp_pytrain.20260424182032.134_20260424_182032 Paper: pytrain.20260424182032.134	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 18:21	Success	-	View
exp_self.20260424181339.538_20260424_181340 Paper: self.20260424181339.538	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424181339.538 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 18:14	Success	-	View
exp_self.20260424180613.537_20260424_180613 Paper: self.20260424180613.537	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424180613.537 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 18:07	Success	-	View
exp_self.20260424175847.536_20260424_175848 Paper: self.20260424175847.536	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424175847.536 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 17:59	Success	-	View
exp_self.20260424175124.535_20260424_175124 Paper: self.20260424175124.535	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424175124.535 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 17:52	Success	-	View
exp_pytrain.20260424174905.133_20260424_174905 Paper: pytrain.20260424174905.133	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 17:50	Success	-	View
exp_self.20260424174211.534_20260424_174211 Paper: self.20260424174211.534	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424174211.534 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 17:43	Success	-	View
exp_self.20260424173448.533_20260424_173449 Paper: self.20260424173448.533	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424173448.533 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 17:35	Success	-	View
exp_self.20260424172728.532_20260424_172728 Paper: self.20260424172728.532	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424172728.532 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 17:28	Success	-	View
exp_self.20260424172006.531_20260424_172006 Paper: self.20260424172006.531	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424172006.531 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 17:21	Success	-	View
exp_pytrain.20260424171745.132_20260424_171745 Paper: pytrain.20260424171745.132	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 17:18	Success	-	View
exp_self.20260424171051.530_20260424_171051 Paper: self.20260424171051.530	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424171051.530 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 17:11	Success	-	View
exp_self.20260424170333.529_20260424_170333 Paper: self.20260424170333.529	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424170333.529 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 17:04	Success	-	View
exp_self.20260424165610.528_20260424_165611 Paper: self.20260424165610.528	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424165610.528 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 16:57	Success	-	View
exp_self.20260424164845.527_20260424_164846 Paper: self.20260424164845.527	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424164845.527 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 16:49	Success	-	View
exp_pytrain.20260424164619.131_20260424_164620 Paper: pytrain.20260424164619.131	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 16:47	Success	-	View
exp_self.20260424163919.526_20260424_163920 Paper: self.20260424163919.526	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424163919.526 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 16:40	Success	-	View
exp_self.20260424163151.525_20260424_163151 Paper: self.20260424163151.525	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424163151.525 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 16:32	Success	-	View
exp_self.20260424162431.524_20260424_162432 Paper: self.20260424162431.524	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424162431.524 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 16:25	Success	-	View
exp_self.20260424161708.523_20260424_161709 Paper: self.20260424161708.523	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424161708.523 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 16:18	Success	-	View
exp_pytrain.20260424161441.130_20260424_161441 Paper: pytrain.20260424161441.130	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 16:15	Success	-	View
exp_self.20260424160931.522_20260424_160932 Paper: self.20260424160931.522	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424160931.522 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 16:10	Success	-	View
exp_self.20260424160214.521_20260424_160214 Paper: self.20260424160214.521	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424160214.521 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 16:03	Success	-	View
exp_self.20260424155448.520_20260424_155449 Paper: self.20260424155448.520	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424155448.520 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 15:55	Success	-	View
exp_self.20260424154722.519_20260424_154723 Paper: self.20260424154722.519	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424154722.519 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 15:48	Success	-	View
exp_gh_vnmoorthy_pavo-bench_20260424_154439 Paper: gh_vnmoorthy_pavo-bench	vnmoorthy/pavo-bench Paper ID: gh_vnmoorthy_pavo-bench - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 15:45	Success	-	View
exp_pytrain.20260424154233.129_20260424_154233 Paper: pytrain.20260424154233.129	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 15:43	Success	-	View
exp_self.20260424153541.518_20260424_153542 Paper: self.20260424153541.518	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424153541.518 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 15:36	Success	-	View
exp_self.20260424152820.517_20260424_152820 Paper: self.20260424152820.517	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424152820.517 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 15:29	Success	-	View
exp_self.20260424152101.516_20260424_152101 Paper: self.20260424152101.516	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424152101.516 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 15:22	Success	-	View
exp_self.20260424151340.515_20260424_151340 Paper: self.20260424151340.515	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424151340.515 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 15:14	Success	-	View
exp_pytrain.20260424151114.128_20260424_151114 Paper: pytrain.20260424151114.128	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 15:12	Success	-	View
exp_self.20260424150423.514_20260424_150423 Paper: self.20260424150423.514	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424150423.514 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 15:05	Success	-	View
exp_self.20260424145658.513_20260424_145658 Paper: self.20260424145658.513	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424145658.513 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 14:58	Success	-	View
exp_self.20260424144935.512_20260424_144936 Paper: self.20260424144935.512	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424144935.512 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 14:50	Success	-	View
exp_self.20260424144218.511_20260424_144218 Paper: self.20260424144218.511	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424144218.511 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 14:43	Success	-	View
exp_pytrain.20260424143953.127_20260424_143953 Paper: pytrain.20260424143953.127	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 14:40	Success	-	View
exp_self.20260424143308.510_20260424_143308 Paper: self.20260424143308.510	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424143308.510 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 14:34	Success	-	View
exp_self.20260424142543.509_20260424_142544 Paper: self.20260424142543.509	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424142543.509 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 14:26	Success	-	View
exp_self.20260424141816.508_20260424_141816 Paper: self.20260424141816.508	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424141816.508 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 14:19	Success	-	View
exp_self.20260424141053.507_20260424_141053 Paper: self.20260424141053.507	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424141053.507 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 14:11	Success	-	View
exp_pytrain.20260424140833.126_20260424_140833 Paper: pytrain.20260424140833.126	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 14:09	Success	-	View
exp_self.20260424140139.506_20260424_140139 Paper: self.20260424140139.506	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424140139.506 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 14:02	Success	-	View
exp_self.20260424135418.505_20260424_135419 Paper: self.20260424135418.505	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424135418.505 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 13:55	Success	-	View
exp_self.20260424134657.504_20260424_134657 Paper: self.20260424134657.504	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424134657.504 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 13:47	Success	-	View
exp_self.20260424133932.503_20260424_133933 Paper: self.20260424133932.503	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424133932.503 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 13:40	Success	-	View
exp_pytrain.20260424133713.125_20260424_133714 Paper: pytrain.20260424133713.125	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 13:38	Success	-	View
exp_self.20260424133018.502_20260424_133019 Paper: self.20260424133018.502	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424133018.502 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 13:31	Success	-	View
exp_self.20260424132254.501_20260424_132254 Paper: self.20260424132254.501	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424132254.501 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 13:23	Success	-	View
exp_self.20260424131526.500_20260424_131526 Paper: self.20260424131526.500	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424131526.500 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 13:16	Success	-	View
exp_self.20260424130759.499_20260424_130759 Paper: self.20260424130759.499	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424130759.499 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 13:09	Success	-	View
exp_pytrain.20260424130541.124_20260424_130541 Paper: pytrain.20260424130541.124	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 13:06	Success	-	View
exp_self.20260424125849.498_20260424_125850 Paper: self.20260424125849.498	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424125849.498 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 12:59	Success	-	View
exp_self.20260424125129.497_20260424_125129 Paper: self.20260424125129.497	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424125129.497 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 12:52	Success	-	View
exp_hf_2604.20156_20260424_124815 Paper: hf_2604.20156	Temporally Extended Mixture-of-Experts Models Paper ID: hf_2604.20156 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-24 12:49	Success	-	View
exp_self.20260424124252.496_20260424_124252 Paper: self.20260424124252.496	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424124252.496 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 12:43	Success	-	View
exp_self.20260424123524.495_20260424_123524 Paper: self.20260424123524.495	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424123524.495 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 12:36	Success	-	View
exp_pytrain.20260424123302.123_20260424_123303 Paper: pytrain.20260424123302.123	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 12:34	Success	-	View
exp_self.20260424122607.494_20260424_122607 Paper: self.20260424122607.494	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424122607.494 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 12:27	Success	-	View
exp_self.20260424121850.493_20260424_121850 Paper: self.20260424121850.493	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424121850.493 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 12:19	Success	-	View
exp_self.20260424121129.492_20260424_121129 Paper: self.20260424121129.492	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424121129.492 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 12:12	Success	-	View
exp_hf_2506.17001_20260424_120557 Paper: hf_2506.17001	PersonalAI: A Systematic Comparison of Knowledge Graph Storage and Retrieval Approaches for Personalized LLM agents Paper ID: hf_2506.17001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-24 12:06	Success	-	View
exp_self.20260424120400.491_20260424_120401 Paper: self.20260424120400.491	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424120400.491 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 12:05	Success	-	View
exp_pytrain.20260424120135.122_20260424_120135 Paper: pytrain.20260424120135.122	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 12:02	Success	-	View
exp_self.20260424115442.490_20260424_115443 Paper: self.20260424115442.490	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424115442.490 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 11:55	Success	-	View
exp_cr_10.1093_scipol_scag026_20260424_115020 Paper: cr_10.1093_scipol_scag026	Generative AI in public administration: evaluating a fine-tuned large language model for policy briefing notes Paper ID: cr_10.1093_scipol_scag026 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovere...	04-24 11:51	Success	-	View
exp_self.20260424114713.489_20260424_114714 Paper: self.20260424114713.489	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424114713.489 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 11:48	Success	-	View
exp_self.20260424113950.488_20260424_113950 Paper: self.20260424113950.488	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424113950.488 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 11:40	Success	-	View
exp_self.20260424113134.487_20260424_113134 Paper: self.20260424113134.487	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424113134.487 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 11:32	Success	-	View
exp_pytrain.20260424112913.121_20260424_112913 Paper: pytrain.20260424112913.121	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 11:30	Success	-	View
exp_self.20260424112212.486_20260424_112213 Paper: self.20260424112212.486	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424112212.486 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 11:23	Success	-	View
exp_self.20260424111449.485_20260424_111449 Paper: self.20260424111449.485	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424111449.485 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 11:15	Success	-	View
exp_self.20260424110725.484_20260424_110725 Paper: self.20260424110725.484	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424110725.484 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 11:08	Success	-	View
exp_self.20260424105957.483_20260424_105957 Paper: self.20260424105957.483	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424105957.483 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 11:00	Success	-	View
exp_pytrain.20260424105733.120_20260424_105733 Paper: pytrain.20260424105733.120	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 10:58	Success	-	View
exp_self.20260424105037.482_20260424_105037 Paper: self.20260424105037.482	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424105037.482 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 10:51	Success	-	View
exp_self.20260424104314.481_20260424_104314 Paper: self.20260424104314.481	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424104314.481 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 10:44	Success	-	View
exp_self.20260424103553.480_20260424_103554 Paper: self.20260424103553.480	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424103553.480 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 10:36	Success	-	View
exp_self.20260424102824.479_20260424_102825 Paper: self.20260424102824.479	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424102824.479 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 10:29	Success	-	View
exp_pytrain.20260424102556.119_20260424_102556 Paper: pytrain.20260424102556.119	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 10:26	Success	-	View
exp_self.20260424101902.478_20260424_101902 Paper: self.20260424101902.478	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424101902.478 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 10:20	Success	-	View
exp_self.20260424101135.477_20260424_101135 Paper: self.20260424101135.477	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424101135.477 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 10:12	Success	-	View
exp_self.20260424100413.476_20260424_100413 Paper: self.20260424100413.476	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424100413.476 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 10:05	Success	-	View
exp_self.20260424095652.475_20260424_095652 Paper: self.20260424095652.475	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424095652.475 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 09:57	Success	-	View
exp_pytrain.20260424095422.118_20260424_095423 Paper: pytrain.20260424095422.118	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 09:55	Success	-	View
exp_hf_2604.21915_20260424_095140 Paper: hf_2604.21915	Vista4D: Video Reshooting with 4D Point Clouds Paper ID: hf_2604.21915 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-24 09:52	Success	-	View
exp_self.20260424094728.474_20260424_094729 Paper: self.20260424094728.474	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424094728.474 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 09:48	Success	-	View
exp_cr_10.3390_s26092643_20260424_094413 Paper: cr_10.3390_s26092643	Prediction of BDS-3 Satellite Clock Bias Based on the Mamba-LSTM Model Paper ID: cr_10.3390_s26092643 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered ben...	04-24 09:45	Success	-	View
exp_self.20260424093855.473_20260424_093855 Paper: self.20260424093855.473	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424093855.473 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 09:39	Success	-	View
exp_self.20260424093129.472_20260424_093130 Paper: self.20260424093129.472	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424093129.472 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 09:32	Success	-	View
exp_self.20260424092400.471_20260424_092401 Paper: self.20260424092400.471	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424092400.471 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 09:25	Success	-	View
exp_pytrain.20260424092142.117_20260424_092142 Paper: pytrain.20260424092142.117	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 09:22	Success	-	View
exp_self.20260424091448.470_20260424_091448 Paper: self.20260424091448.470	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424091448.470 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 09:15	Success	-	View
exp_self.20260424090726.469_20260424_090726 Paper: self.20260424090726.469	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424090726.469 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 09:08	Success	-	View
exp_self.20260424090001.468_20260424_090002 Paper: self.20260424090001.468	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424090001.468 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 09:01	Success	-	View
exp_self.20260424085232.467_20260424_085232 Paper: self.20260424085232.467	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424085232.467 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 08:53	Success	-	View
exp_pytrain.20260424085007.116_20260424_085007 Paper: pytrain.20260424085007.116	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 08:51	Success	-	View
exp_gh_Rianbajukendari_mini-infer_20260424_084754 Paper: gh_Rianbajukendari_mini-infer	Rianbajukendari/mini-infer Paper ID: gh_Rianbajukendari_mini-infer - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Reco...	04-24 08:48	Success	-	View
exp_self.20260424084158.466_20260424_084158 Paper: self.20260424084158.466	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424084158.466 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 08:43	Success	-	View
exp_self.20260424083437.465_20260424_083437 Paper: self.20260424083437.465	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424083437.465 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 08:35	Success	-	View
exp_self.20260424082716.464_20260424_082716 Paper: self.20260424082716.464	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424082716.464 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 08:28	Success	-	View
exp_self.20260424081950.463_20260424_081951 Paper: self.20260424081950.463	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424081950.463 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 08:20	Success	-	View
exp_pytrain.20260424081723.115_20260424_081723 Paper: pytrain.20260424081723.115	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 08:18	Success	-	View
exp_self.20260424081031.462_20260424_081031 Paper: self.20260424081031.462	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424081031.462 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 08:11	Success	-	View
exp_self.20260424080308.461_20260424_080308 Paper: self.20260424080308.461	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424080308.461 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 08:04	Success	-	View
exp_self.20260424075541.460_20260424_075542 Paper: self.20260424075541.460	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424075541.460 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 07:56	Success	-	View
exp_self.20260424074819.459_20260424_074819 Paper: self.20260424074819.459	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424074819.459 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 07:49	Success	-	View
exp_pytrain.20260424074551.114_20260424_074551 Paper: pytrain.20260424074551.114	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 07:46	Success	-	View
exp_self.20260424073904.458_20260424_073904 Paper: self.20260424073904.458	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424073904.458 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 07:40	Success	-	View
exp_self.20260424073135.457_20260424_073135 Paper: self.20260424073135.457	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424073135.457 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 07:32	Success	-	View
exp_self.20260424072413.456_20260424_072414 Paper: self.20260424072413.456	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424072413.456 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 07:25	Success	-	View
exp_self.20260424071649.455_20260424_071649 Paper: self.20260424071649.455	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424071649.455 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 07:17	Success	-	View
exp_pytrain.20260424071420.113_20260424_071420 Paper: pytrain.20260424071420.113	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 07:15	Success	-	View
exp_self.20260424070724.454_20260424_070724 Paper: self.20260424070724.454	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424070724.454 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 07:08	Success	-	View
exp_self.20260424065953.453_20260424_065954 Paper: self.20260424065953.453	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424065953.453 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 07:00	Success	-	View
exp_self.20260424065228.452_20260424_065228 Paper: self.20260424065228.452	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424065228.452 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 06:53	Success	-	View
exp_self.20260424064506.451_20260424_064506 Paper: self.20260424064506.451	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424064506.451 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 06:46	Success	-	View
exp_pytrain.20260424064241.112_20260424_064242 Paper: pytrain.20260424064241.112	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 06:43	Success	-	View
exp_self.20260424063555.450_20260424_063555 Paper: self.20260424063555.450	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424063555.450 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 06:36	Success	-	View
exp_self.20260424062825.449_20260424_062825 Paper: self.20260424062825.449	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424062825.449 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 06:29	Success	-	View
exp_self.20260424062056.448_20260424_062056 Paper: self.20260424062056.448	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424062056.448 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 06:21	Success	-	View
exp_self.20260424061333.447_20260424_061334 Paper: self.20260424061333.447	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424061333.447 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 06:14	Success	-	View
exp_pytrain.20260424061108.111_20260424_061109 Paper: pytrain.20260424061108.111	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 06:12	Success	-	View
exp_self.20260424060415.446_20260424_060415 Paper: self.20260424060415.446	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424060415.446 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 06:05	Success	-	View
exp_hf_2604.21668_20260424_055844 Paper: hf_2604.21668	Encoder-Free Human Motion Understanding via Structured Motion Descriptions Paper ID: hf_2604.21668 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-24 05:59	Success	-	View
exp_self.20260424055647.445_20260424_055648 Paper: self.20260424055647.445	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424055647.445 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 05:57	Success	-	View
exp_self.20260424054923.444_20260424_054924 Paper: self.20260424054923.444	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424054923.444 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 05:50	Success	-	View
exp_self.20260424054155.443_20260424_054156 Paper: self.20260424054155.443	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424054155.443 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 05:42	Success	-	View
exp_pytrain.20260424053931.110_20260424_053931 Paper: pytrain.20260424053931.110	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 05:40	Success	-	View
exp_self.20260424053237.442_20260424_053237 Paper: self.20260424053237.442	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424053237.442 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 05:33	Success	-	View
exp_self.20260424052513.441_20260424_052513 Paper: self.20260424052513.441	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424052513.441 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 05:26	Success	-	View
exp_self.20260424051751.440_20260424_051752 Paper: self.20260424051751.440	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424051751.440 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 05:18	Success	-	View
exp_self.20260424051029.439_20260424_051030 Paper: self.20260424051029.439	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424051029.439 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 05:11	Success	-	View
exp_pytrain.20260424050800.109_20260424_050801 Paper: pytrain.20260424050800.109	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 05:09	Success	-	View
exp_self.20260424050109.438_20260424_050109 Paper: self.20260424050109.438	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424050109.438 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 05:02	Success	-	View
exp_self.20260424045344.437_20260424_045344 Paper: self.20260424045344.437	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424045344.437 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 04:54	Success	-	View
exp_cr_10.38124_ijisrt_26apr950_20260424_044818 Paper: cr_10.38124_ijisrt_26apr950	Contextiva: An Integrated Framework Based on Agentic Retrieval Augmented Generation and Model Context Protocol for AI-As... Paper ID: cr_10.38124_ijisrt_26apr950 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recove...	04-24 04:49	Success	-	View
exp_self.20260424044615.436_20260424_044615 Paper: self.20260424044615.436	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424044615.436 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 04:47	Success	-	View
exp_self.20260424043845.435_20260424_043845 Paper: self.20260424043845.435	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424043845.435 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 04:39	Success	-	View
exp_pytrain.20260424043627.108_20260424_043627 Paper: pytrain.20260424043627.108	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 04:37	Success	-	View
exp_self.20260424042933.434_20260424_042933 Paper: self.20260424042933.434	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424042933.434 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 04:30	Success	-	View
exp_self.20260424042212.433_20260424_042212 Paper: self.20260424042212.433	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424042212.433 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 04:23	Success	-	View
exp_self.20260424041439.432_20260424_041439 Paper: self.20260424041439.432	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424041439.432 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 04:15	Success	-	View
exp_oa_W7155244741_20260424_040908 Paper: oa_W7155244741	Efficient Video Diffusion Models: Advancements and Challenges Paper ID: oa_W7155244741 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-24 04:10	Success	-	View
exp_self.20260424040713.431_20260424_040714 Paper: self.20260424040713.431	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424040713.431 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 04:08	Success	-	View
exp_pytrain.20260424040446.107_20260424_040446 Paper: pytrain.20260424040446.107	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 04:05	Success	-	View
exp_oa_W7155244458_20260424_040204 Paper: oa_W7155244458	Neural Garbage Collection: Learning to Forget while Learning to Reason Paper ID: oa_W7155244458 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-24 04:03	Success	-	View
exp_self.20260424035636.430_20260424_035636 Paper: self.20260424035636.430	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424035636.430 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 03:57	Success	-	View
exp_self.20260424034907.429_20260424_034908 Paper: self.20260424034907.429	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424034907.429 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 03:50	Success	-	View
exp_self.20260424034147.428_20260424_034148 Paper: self.20260424034147.428	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424034147.428 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 03:42	Success	-	View
exp_self.20260424033428.427_20260424_033428 Paper: self.20260424033428.427	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424033428.427 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 03:35	Success	-	View
exp_pytrain.20260424033201.106_20260424_033202 Paper: pytrain.20260424033201.106	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 03:33	Success	-	View
exp_self.20260424032515.426_20260424_032515 Paper: self.20260424032515.426	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424032515.426 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 03:26	Success	-	View
exp_self.20260424031744.425_20260424_031745 Paper: self.20260424031744.425	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424031744.425 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 03:18	Success	-	View
exp_self.20260424031021.424_20260424_031021 Paper: self.20260424031021.424	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424031021.424 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 03:11	Success	-	View
exp_self.20260424030259.423_20260424_030259 Paper: self.20260424030259.423	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424030259.423 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 03:04	Success	-	View
exp_pytrain.20260424030035.105_20260424_030035 Paper: pytrain.20260424030035.105	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 03:01	Success	-	View
exp_self.20260424025348.422_20260424_025349 Paper: self.20260424025348.422	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424025348.422 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 02:54	Success	-	View
exp_self.20260424024625.421_20260424_024625 Paper: self.20260424024625.421	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424024625.421 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 02:47	Success	-	View
exp_self.20260424023857.420_20260424_023858 Paper: self.20260424023857.420	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424023857.420 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 02:40	Success	-	View
exp_self.20260424023139.419_20260424_023140 Paper: self.20260424023139.419	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424023139.419 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 02:32	Success	-	View
exp_pytrain.20260424022920.104_20260424_022921 Paper: pytrain.20260424022920.104	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 02:30	Success	-	View
exp_self.20260424022400.418_20260424_022401 Paper: self.20260424022400.418	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424022400.418 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 02:25	Success	-	View
exp_self.20260424021639.417_20260424_021639 Paper: self.20260424021639.417	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424021639.417 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 02:17	Success	-	View
exp_self.20260424020919.416_20260424_020919 Paper: self.20260424020919.416	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424020919.416 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 02:10	Success	-	View
exp_self.20260424015946.415_20260424_015947 Paper: self.20260424015946.415	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424015946.415 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 02:00	Success	-	View
exp_pytrain.20260424015727.103_20260424_015727 Paper: pytrain.20260424015727.103	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 01:58	Success	-	View
exp_self.20260424015034.414_20260424_015034 Paper: self.20260424015034.414	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424015034.414 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 01:51	Success	-	View
exp_self.20260424014315.413_20260424_014315 Paper: self.20260424014315.413	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424014315.413 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 01:44	Success	-	View
exp_self.20260424013551.412_20260424_013551 Paper: self.20260424013551.412	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424013551.412 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 01:36	Success	-	View
exp_self.20260424012831.411_20260424_012832 Paper: self.20260424012831.411	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424012831.411 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 01:29	Success	-	View
exp_pytrain.20260424012613.102_20260424_012614 Paper: pytrain.20260424012613.102	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 01:27	Success	-	View
exp_self.20260424011927.410_20260424_011927 Paper: self.20260424011927.410	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424011927.410 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 01:20	Success	-	View
exp_self.20260424011208.409_20260424_011208 Paper: self.20260424011208.409	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424011208.409 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 01:13	Success	-	View
exp_self.20260424010447.408_20260424_010447 Paper: self.20260424010447.408	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424010447.408 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 01:05	Success	-	View
exp_self.20260424005654.407_20260424_005654 Paper: self.20260424005654.407	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424005654.407 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 00:57	Success	-	View
exp_pytrain.20260424005435.101_20260424_005435 Paper: pytrain.20260424005435.101	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 00:55	Success	-	View
exp_gh_Solar-cmd_neural-arithmetic-compression_20260424_004938 Paper: gh_Solar-cmd_neural-arithmetic-compression	Solar-cmd/neural-arithmetic-compression Paper ID: gh_Solar-cmd_neural-arithmetic-compression - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected...	04-24 00:50	Success	-	View
exp_self.20260424004739.406_20260424_004739 Paper: self.20260424004739.406	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424004739.406 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 00:48	Success	-	View
exp_self.20260424004015.405_20260424_004016 Paper: self.20260424004015.405	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424004015.405 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 00:41	Success	-	View
exp_self.20260424003259.404_20260424_003259 Paper: self.20260424003259.404	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424003259.404 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 00:34	Success	-	View
exp_self.20260424002543.403_20260424_002543 Paper: self.20260424002543.403	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424002543.403 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 00:26	Success	-	View
exp_pytrain.20260424002319.100_20260424_002320 Paper: pytrain.20260424002319.100	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-24 00:24	Success	-	View
exp_hf_2604.20398_20260424_002105 Paper: hf_2604.20398	WebGen-R1: Incentivizing Large Language Models to Generate Functional and Aesthetic Websites with Reinforcement Learning Paper ID: hf_2604.20398 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-24 00:22	Success	-	View
exp_self.20260424001759.402_20260424_001759 Paper: self.20260424001759.402	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424001759.402 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 00:19	Success	-	View
exp_self.20260424001039.401_20260424_001040 Paper: self.20260424001039.401	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424001039.401 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 00:11	Success	-	View
exp_hf_2604.20244_20260424_000622 Paper: hf_2604.20244	Hybrid Policy Distillation for LLMs Paper ID: hf_2604.20244 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-24 00:07	Success	-	View
exp_self.20260424000317.400_20260424_000317 Paper: self.20260424000317.400	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260424000317.400 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-24 00:04	Success	-	View
exp_self.20260423235553.399_20260423_235554 Paper: self.20260423235553.399	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423235553.399 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 23:56	Success	-	View
exp_pytrain.20260423235011.099_20260423_235011 Paper: pytrain.20260423235011.099	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 23:51	Success	-	View
exp_self.20260423234818.398_20260423_234818 Paper: self.20260423234818.398	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423234818.398 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 23:49	Success	-	View
exp_self.20260423234057.397_20260423_234058 Paper: self.20260423234057.397	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423234057.397 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 23:42	Success	-	View
exp_self.20260423233340.396_20260423_233340 Paper: self.20260423233340.396	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423233340.396 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 23:34	Success	-	View
exp_self.20260423232622.395_20260423_232622 Paper: self.20260423232622.395	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423232622.395 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 23:27	Success	-	View
exp_hf_2604.20987_20260423_232309 Paper: hf_2604.20987	Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks Paper ID: hf_2604.20987 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-23 23:24	Success	-	View
exp_pytrain.20260423231853.098_20260423_231853 Paper: pytrain.20260423231853.098	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 23:19	Success	-	View
exp_self.20260423231700.394_20260423_231701 Paper: self.20260423231700.394	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423231700.394 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 23:18	Success	-	View
exp_self.20260423230943.393_20260423_230944 Paper: self.20260423230943.393	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423230943.393 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 23:10	Success	-	View
exp_self.20260423230223.392_20260423_230223 Paper: self.20260423230223.392	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423230223.392 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 23:03	Success	-	View
exp_self.20260423225502.391_20260423_225502 Paper: self.20260423225502.391	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423225502.391 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 22:56	Success	-	View
exp_self.20260423224742.390_20260423_224742 Paper: self.20260423224742.390	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423224742.390 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 22:48	Success	-	View
exp_pytrain.20260423224525.097_20260423_224525 Paper: pytrain.20260423224525.097	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 22:46	Success	-	View
exp_self.20260423223833.389_20260423_223833 Paper: self.20260423223833.389	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423223833.389 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 22:39	Success	-	View
exp_self.20260423223116.388_20260423_223116 Paper: self.20260423223116.388	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423223116.388 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 22:32	Success	-	View
exp_hf_2604.21193_20260423_222801 Paper: hf_2604.21193	Trust but Verify: Introducing DAVinCI -- A Framework for Dual Attribution and Verification in Claim Inference for Langua... Paper ID: hf_2604.21193 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-23 22:29	Success	-	View
exp_hf_2604.21889_20260423_222436 Paper: hf_2604.21889	TingIS: Real-time Risk Event Discovery from Noisy Customer Incidents at Enterprise Scale Paper ID: hf_2604.21889 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-23 22:25	Success	-	View
exp_self.20260423222240.387_20260423_222240 Paper: self.20260423222240.387	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423222240.387 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 22:23	Success	-	View
exp_self.20260423221519.386_20260423_221519 Paper: self.20260423221519.386	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423221519.386 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 22:16	Success	-	View
exp_pytrain.20260423221251.096_20260423_221251 Paper: pytrain.20260423221251.096	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 22:13	Success	-	View
exp_self.20260423220600.385_20260423_220600 Paper: self.20260423220600.385	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423220600.385 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 22:07	Success	-	View
exp_self.20260423215835.384_20260423_215835 Paper: self.20260423215835.384	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423215835.384 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 21:59	Success	-	View
exp_self.20260423215114.383_20260423_215114 Paper: self.20260423215114.383	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423215114.383 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 21:52	Success	-	View
exp_self.20260423214356.382_20260423_214357 Paper: self.20260423214356.382	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423214356.382 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 21:44	Success	-	View
exp_pytrain.20260423214025.095_20260423_214026 Paper: pytrain.20260423214025.095	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 21:41	Success	-	View
exp_self.20260423213617.381_20260423_213617 Paper: self.20260423213617.381	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423213617.381 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 21:37	Success	-	View
exp_self.20260423212858.380_20260423_212859 Paper: self.20260423212858.380	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423212858.380 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 21:30	Success	-	View
exp_hf_2604.19734_20260423_212546 Paper: hf_2604.19734	UniT: Toward a Unified Physical Language for Human-to-Humanoid Policy Learning and World Modeling Paper ID: hf_2604.19734 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-23 21:26	Success	-	View
exp_self.20260423212134.379_20260423_212134 Paper: self.20260423212134.379	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423212134.379 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 21:22	Success	-	View
exp_self.20260423211425.378_20260423_211430 Paper: self.20260423211425.378	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423211425.378 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 21:15	Success	-	View
exp_pytrain.20260423210832.094_20260423_210835 Paper: pytrain.20260423210832.094	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 21:09	Success	-	View
exp_self.20260423210534.377_20260423_210538 Paper: self.20260423210534.377	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423210534.377 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 21:06	Success	-	View
exp_self.20260423205647.376_20260423_205649 Paper: self.20260423205647.376	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423205647.376 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 20:57	Success	-	View
exp_self.20260423204823.375_20260423_204829 Paper: self.20260423204823.375	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423204823.375 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 20:49	Success	-	View
exp_self.20260423203929.374_20260423_203936 Paper: self.20260423203929.374	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423203929.374 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 20:40	Success	-	View
exp_pytrain.20260423203416.093_20260423_203418 Paper: pytrain.20260423203416.093	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 20:35	Success	-	View
exp_self.20260423203128.373_20260423_203132 Paper: self.20260423203128.373	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423203128.373 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 20:32	Success	-	View
exp_self.20260423202218.372_20260423_202227 Paper: self.20260423202218.372	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423202218.372 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 20:23	Success	-	View
exp_self.20260423201343.371_20260423_201346 Paper: self.20260423201343.371	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423201343.371 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 20:14	Success	-	View
exp_2604.21816v1_20260423_200900 Paper: 2604.21816v1	Tool Attention Is All You Need: Dynamic Tool Gating and Lazy Schema Loading for Eliminating the MCP/Tools Tax in Scalabl... Paper ID: 2604.21816v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-23 20:10	Success	-	View
exp_self.20260423200455.370_20260423_200458 Paper: self.20260423200455.370	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423200455.370 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 20:06	Success	-	View
exp_pytrain.20260423195941.092_20260423_195941 Paper: pytrain.20260423195941.092	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 20:00	Success	-	View
exp_self.20260423195641.369_20260423_195649 Paper: self.20260423195641.369	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423195641.369 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 19:57	Success	-	View
exp_self.20260423194749.368_20260423_194754 Paper: self.20260423194749.368	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423194749.368 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 19:48	Success	-	View
exp_hf_2604.20200_20260423_194311 Paper: hf_2604.20200	Chasing the Public Score: User Pressure and Evaluation Exploitation in Coding Agent Workflows Paper ID: hf_2604.20200 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-23 19:44	Success	-	View
exp_self.20260423193852.367_20260423_193858 Paper: self.20260423193852.367	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423193852.367 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 19:40	Success	-	View
exp_self.20260423193005.366_20260423_193008 Paper: self.20260423193005.366	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423193005.366 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 19:31	Success	-	View
exp_pytrain.20260423192419.091_20260423_192421 Paper: pytrain.20260423192419.091	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 19:25	Success	-	View
exp_self.20260423192135.365_20260423_192139 Paper: self.20260423192135.365	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423192135.365 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 19:22	Success	-	View
exp_self.20260423191213.364_20260423_191218 Paper: self.20260423191213.364	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423191213.364 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 19:13	Success	-	View
exp_self.20260423190323.363_20260423_190325 Paper: self.20260423190323.363	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423190323.363 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 19:04	Success	-	View
exp_self.20260423185444.362_20260423_185447 Paper: self.20260423185444.362	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423185444.362 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 18:55	Success	-	View
exp_pytrain.20260423184941.090_20260423_184945 Paper: pytrain.20260423184941.090	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 18:50	Success	-	View
exp_self.20260423184708.361_20260423_184710 Paper: self.20260423184708.361	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423184708.361 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 18:48	Success	-	View
exp_self.20260423183819.360_20260423_183822 Paper: self.20260423183819.360	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423183819.360 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 18:39	Success	-	View
exp_self.20260423182901.359_20260423_182904 Paper: self.20260423182901.359	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423182901.359 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 18:30	Success	-	View
exp_self.20260423182021.358_20260423_182022 Paper: self.20260423182021.358	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423182021.358 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 18:21	Success	-	View
exp_pytrain.20260423181516.089_20260423_181520 Paper: pytrain.20260423181516.089	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 18:16	Success	-	View
exp_self.20260423181226.357_20260423_181230 Paper: self.20260423181226.357	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423181226.357 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 18:13	Success	-	View
exp_self.20260423180403.356_20260423_180405 Paper: self.20260423180403.356	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423180403.356 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 18:05	Success	-	View
exp_self.20260423175618.355_20260423_175622 Paper: self.20260423175618.355	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423175618.355 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 17:57	Success	-	View
exp_self.20260423174735.354_20260423_174739 Paper: self.20260423174735.354	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423174735.354 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 17:48	Success	-	View
exp_pytrain.20260423174345.088_20260423_174350 Paper: pytrain.20260423174345.088	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 17:44	Success	-	View
exp_self.20260423173647.353_20260423_173650 Paper: self.20260423173647.353	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423173647.353 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 17:37	Success	-	View
exp_self.20260423172829.352_20260423_172830 Paper: self.20260423172829.352	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423172829.352 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 17:29	Success	-	View
exp_self.20260423172111.351_20260423_172112 Paper: self.20260423172111.351	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423172111.351 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 17:22	Success	-	View
exp_self.20260423171354.350_20260423_171354 Paper: self.20260423171354.350	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423171354.350 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 17:14	Success	-	View
exp_pytrain.20260423171127.087_20260423_171127 Paper: pytrain.20260423171127.087	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 17:12	Success	-	View
exp_self.20260423170717.349_20260423_170717 Paper: self.20260423170717.349	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423170717.349 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 17:08	Success	-	View
exp_self.20260423165955.348_20260423_165955 Paper: self.20260423165955.348	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423165955.348 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 17:00	Success	-	View
exp_self.20260423165243.347_20260423_165248 Paper: self.20260423165243.347	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423165243.347 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 16:53	Success	-	View
exp_self.20260423164445.346_20260423_164450 Paper: self.20260423164445.346	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423164445.346 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 16:45	Success	-	View
exp_pytrain.20260423163913.086_20260423_163919 Paper: pytrain.20260423163913.086	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 16:40	Success	-	View
exp_self.20260423163604.345_20260423_163608 Paper: self.20260423163604.345	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423163604.345 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 16:37	Success	-	View
exp_self.20260423162719.344_20260423_162719 Paper: self.20260423162719.344	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423162719.344 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 16:28	Success	-	View
exp_self.20260423162019.343_20260423_162023 Paper: self.20260423162019.343	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423162019.343 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 16:21	Success	-	View
exp_self.20260423161123.342_20260423_161126 Paper: self.20260423161123.342	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423161123.342 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 16:12	Success	-	View
exp_pytrain.20260423160738.085_20260423_160742 Paper: pytrain.20260423160738.085	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 16:08	Success	-	View
exp_self.20260423160023.341_20260423_160027 Paper: self.20260423160023.341	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423160023.341 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 16:01	Success	-	View
exp_self.20260423155107.340_20260423_155113 Paper: self.20260423155107.340	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423155107.340 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 15:52	Success	-	View
exp_self.20260423154226.339_20260423_154228 Paper: self.20260423154226.339	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423154226.339 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 15:43	Success	-	View
exp_cr_10.31449_inf.v50i11.9002_20260423_153854 Paper: cr_10.31449_inf.v50i11.9002	Hybrid LSTM-Transformer Model for Sequential and Context- Aware Tourism Destination Recommendation Paper ID: cr_10.31449_inf.v50i11.9002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recove...	04-23 15:39	Success	-	View
exp_pytrain.20260423153605.084_20260423_153613 Paper: pytrain.20260423153605.084	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 15:37	Success	-	View
exp_self.20260423153035.338_20260423_153036 Paper: self.20260423153035.338	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423153035.338 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 15:31	Success	-	View
exp_self.20260423152233.337_20260423_152236 Paper: self.20260423152233.337	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423152233.337 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 15:23	Success	-	View
exp_self.20260423151431.336_20260423_151432 Paper: self.20260423151431.336	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423151431.336 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 15:15	Success	-	View
exp_self.20260423150707.335_20260423_150707 Paper: self.20260423150707.335	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423150707.335 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 15:08	Success	-	View
exp_pytrain.20260423150434.083_20260423_150435 Paper: pytrain.20260423150434.083	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 15:05	Success	-	View
exp_self.20260423145918.334_20260423_145919 Paper: self.20260423145918.334	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423145918.334 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 15:00	Success	-	View
exp_self.20260423145152.333_20260423_145152 Paper: self.20260423145152.333	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423145152.333 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 14:52	Success	-	View
exp_self.20260423144426.332_20260423_144427 Paper: self.20260423144426.332	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423144426.332 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 14:45	Success	-	View
exp_self.20260423143648.331_20260423_143648 Paper: self.20260423143648.331	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423143648.331 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 14:37	Success	-	View
exp_pytrain.20260423143315.082_20260423_143315 Paper: pytrain.20260423143315.082	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 14:34	Success	-	View
exp_self.20260423142913.330_20260423_142914 Paper: self.20260423142913.330	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423142913.330 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 14:30	Success	-	View
exp_self.20260423142109.329_20260423_142109 Paper: self.20260423142109.329	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423142109.329 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 14:22	Success	-	View
exp_self.20260423141347.328_20260423_141348 Paper: self.20260423141347.328	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423141347.328 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 14:14	Success	-	View
exp_self.20260423140702.327_20260423_140704 Paper: self.20260423140702.327	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423140702.327 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 14:08	Success	-	View
exp_pytrain.20260423140131.081_20260423_140132 Paper: pytrain.20260423140131.081	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 14:02	Success	-	View
exp_self.20260423135925.326_20260423_135930 Paper: self.20260423135925.326	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423135925.326 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 14:00	Success	-	View
exp_self.20260423135101.325_20260423_135103 Paper: self.20260423135101.325	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423135101.325 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 13:52	Success	-	View
exp_self.20260423134229.324_20260423_134229 Paper: self.20260423134229.324	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423134229.324 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 13:43	Success	-	View
exp_self.20260423133519.323_20260423_133521 Paper: self.20260423133519.323	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423133519.323 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 13:36	Success	-	View
exp_pytrain.20260423132946.080_20260423_132948 Paper: pytrain.20260423132946.080	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 13:30	Success	-	View
exp_self.20260423132733.322_20260423_132733 Paper: self.20260423132733.322	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423132733.322 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 13:28	Success	-	View
exp_hf_2604.19835_20260423_132432 Paper: hf_2604.19835	Expert Upcycling: Shifting the Compute-Efficient Frontier of Mixture-of-Experts Paper ID: hf_2604.19835 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-23 13:25	Success	-	View
exp_self.20260423131707.321_20260423_131709 Paper: self.20260423131707.321	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423131707.321 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 13:18	Success	-	View
exp_self.20260423130842.320_20260423_130842 Paper: self.20260423130842.320	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423130842.320 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 13:09	Success	-	View
exp_self.20260423130127.319_20260423_130127 Paper: self.20260423130127.319	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423130127.319 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 13:02	Success	-	View
exp_pytrain.20260423125758.079_20260423_125758 Paper: pytrain.20260423125758.079	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 12:59	Success	-	View
exp_self.20260423125450.318_20260423_125450 Paper: self.20260423125450.318	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423125450.318 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 12:55	Success	-	View
exp_self.20260423124733.317_20260423_124734 Paper: self.20260423124733.317	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423124733.317 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 12:48	Success	-	View
exp_cr_10.3390_s26092616_20260423_124206 Paper: cr_10.3390_s26092616	MSW-Mamba-Det: Multi-Scale Windowed State-Space Modeling for End-to-End Defect Detection in Photovoltaic Module Electrol... Paper ID: cr_10.3390_s26092616 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered ben...	04-23 12:43	Success	-	View
exp_self.20260423124008.316_20260423_124008 Paper: self.20260423124008.316	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423124008.316 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 12:41	Success	-	View
exp_self.20260423123248.315_20260423_123248 Paper: self.20260423123248.315	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423123248.315 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 12:33	Success	-	View
exp_pytrain.20260423122638.078_20260423_122638 Paper: pytrain.20260423122638.078	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 12:27	Success	-	View
exp_self.20260423122445.314_20260423_122446 Paper: self.20260423122445.314	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423122445.314 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 12:25	Success	-	View
exp_self.20260423121729.313_20260423_121730 Paper: self.20260423121729.313	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423121729.313 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 12:18	Success	-	View
exp_self.20260423121008.312_20260423_121008 Paper: self.20260423121008.312	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423121008.312 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 12:11	Success	-	View
exp_self.20260423120248.311_20260423_120248 Paper: self.20260423120248.311	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423120248.311 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 12:03	Success	-	View
exp_self.20260423115529.310_20260423_115529 Paper: self.20260423115529.310	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423115529.310 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 11:56	Success	-	View
exp_pytrain.20260423115311.077_20260423_115312 Paper: pytrain.20260423115311.077	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 11:54	Success	-	View
exp_self.20260423114752.309_20260423_114752 Paper: self.20260423114752.309	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423114752.309 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 11:48	Success	-	View
exp_self.20260423114033.308_20260423_114033 Paper: self.20260423114033.308	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423114033.308 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 11:41	Success	-	View
exp_self.20260423113356.307_20260423_113357 Paper: self.20260423113356.307	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423113356.307 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 11:34	Success	-	View
exp_self.20260423112605.306_20260423_112606 Paper: self.20260423112605.306	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423112605.306 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 11:27	Success	-	View
exp_pytrain.20260423112150.076_20260423_112151 Paper: pytrain.20260423112150.076	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 11:22	Success	-	View
exp_self.20260423111851.305_20260423_111853 Paper: self.20260423111851.305	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423111851.305 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 11:19	Success	-	View
exp_self.20260423111127.304_20260423_111128 Paper: self.20260423111127.304	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423111127.304 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 11:12	Success	-	View
exp_hf_2604.20720_20260423_110510 Paper: hf_2604.20720	COMPASS: COntinual Multilingual PEFT with Adaptive Semantic Sampling Paper ID: hf_2604.20720 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-23 11:06	Success	-	View
exp_self.20260423110306.303_20260423_110306 Paper: self.20260423110306.303	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423110306.303 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 11:04	Success	-	View
exp_self.20260423105538.302_20260423_105541 Paper: self.20260423105538.302	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423105538.302 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 10:56	Success	-	View
exp_pytrain.20260423105034.075_20260423_105034 Paper: pytrain.20260423105034.075	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 10:51	Success	-	View
exp_self.20260423104822.301_20260423_104823 Paper: self.20260423104822.301	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423104822.301 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 10:49	Success	-	View
exp_self.20260423104116.300_20260423_104117 Paper: self.20260423104116.300	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423104116.300 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 10:42	Success	-	View
exp_self.20260423103357.299_20260423_103357 Paper: self.20260423103357.299	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423103357.299 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 10:34	Success	-	View
exp_self.20260423102622.298_20260423_102623 Paper: self.20260423102622.298	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423102622.298 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 10:27	Success	-	View
exp_pytrain.20260423101601.074_20260423_101802 Paper: pytrain.20260423101601.074	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 10:19	Success	-	View
exp_self.20260423095646.297_20260423_095648 Paper: self.20260423095646.297	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423095646.297 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 09:57	Success	-	View
exp_hf_2604.18780_20260423_095248 Paper: hf_2604.18780	Streaming Structured Inference with Flash-SemiCRF Paper ID: hf_2604.18780 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-23 09:53	Success	-	View
exp_self.20260423094933.296_20260423_094933 Paper: self.20260423094933.296	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423094933.296 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 09:50	Success	-	View
exp_hf_2604.16659_20260423_094450 Paper: hf_2604.16659	Benign Fine-Tuning Breaks Safety Alignment in Audio LLMs Paper ID: hf_2604.16659 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-23 09:45	Success	-	View
exp_self.20260423094211.295_20260423_094213 Paper: self.20260423094211.295	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423094211.295 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 09:43	Success	-	View
exp_pytrain.20260423093908.073_20260423_093910 Paper: pytrain.20260423093908.073	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 09:40	Success	-	View
exp_self.20260423093432.294_20260423_093436 Paper: self.20260423093432.294	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423093432.294 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 09:35	Success	-	View
exp_self.20260423092722.293_20260423_092723 Paper: self.20260423092722.293	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423092722.293 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 09:28	Success	-	View
exp_hf_2604.15093_20260423_092417 Paper: hf_2604.15093	OpenMobile: Building Open Mobile Agents with Task and Trajectory Synthesis Paper ID: hf_2604.15093 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-23 09:25	Success	-	View
exp_self.20260423091752.292_20260423_091753 Paper: self.20260423091752.292	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423091752.292 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 09:18	Success	-	View
exp_self.20260423090952.291_20260423_090953 Paper: self.20260423090952.291	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423090952.291 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 09:10	Success	-	View
exp_pytrain.20260423090709.072_20260423_090709 Paper: pytrain.20260423090709.072	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 09:08	Success	-	View
exp_self.20260423090034.290_20260423_090034 Paper: self.20260423090034.290	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423090034.290 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 09:01	Success	-	View
exp_self.20260423085335.289_20260423_085335 Paper: self.20260423085335.289	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423085335.289 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 08:54	Success	-	View
exp_self.20260423084612.288_20260423_084612 Paper: self.20260423084612.288	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423084612.288 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 08:47	Success	-	View
exp_self.20260423083908.287_20260423_083911 Paper: self.20260423083908.287	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423083908.287 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 08:40	Success	-	View
exp_pytrain.20260423083552.071_20260423_083553 Paper: pytrain.20260423083552.071	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 08:36	Success	-	View
exp_self.20260423083000.286_20260423_083002 Paper: self.20260423083000.286	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423083000.286 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 08:31	Success	-	View
exp_self.20260423082237.285_20260423_082239 Paper: self.20260423082237.285	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423082237.285 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 08:23	Success	-	View
exp_self.20260423081454.284_20260423_081455 Paper: self.20260423081454.284	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423081454.284 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 08:15	Success	-	View
exp_cr_10.3390_agriculture16090927_20260423_081104 Paper: cr_10.3390_agriculture16090927	A Copula-Based Efficiency Effects Stochastic Frontier Model with Application to Government Programs in Thai Rice Farming Paper ID: cr_10.3390_agriculture16090927 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Rec...	04-23 08:12	Success	-	View
exp_self.20260423080734.283_20260423_080737 Paper: self.20260423080734.283	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423080734.283 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 08:08	Success	-	View
exp_pytrain.20260423080413.070_20260423_080415 Paper: pytrain.20260423080413.070	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 08:05	Success	-	View
exp_self.20260423075737.282_20260423_075738 Paper: self.20260423075737.282	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423075737.282 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 07:58	Success	-	View
exp_self.20260423075031.281_20260423_075032 Paper: self.20260423075031.281	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423075031.281 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 07:51	Success	-	View
exp_self.20260423074326.280_20260423_074328 Paper: self.20260423074326.280	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423074326.280 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 07:44	Success	-	View
exp_self.20260423073556.279_20260423_073558 Paper: self.20260423073556.279	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423073556.279 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 07:37	Success	-	View
exp_pytrain.20260423073235.069_20260423_073239 Paper: pytrain.20260423073235.069	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 07:33	Success	-	View
exp_self.20260423072616.278_20260423_072618 Paper: self.20260423072616.278	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423072616.278 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 07:27	Success	-	View
exp_self.20260423071835.277_20260423_071836 Paper: self.20260423071835.277	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423071835.277 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 07:19	Success	-	View
exp_self.20260423071105.276_20260423_071108 Paper: self.20260423071105.276	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423071105.276 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 07:12	Success	-	View
exp_self.20260423070343.275_20260423_070344 Paper: self.20260423070343.275	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423070343.275 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 07:04	Success	-	View
exp_pytrain.20260423070046.068_20260423_070048 Paper: pytrain.20260423070046.068	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 07:01	Success	-	View
exp_self.20260423065413.274_20260423_065415 Paper: self.20260423065413.274	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423065413.274 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 06:55	Success	-	View
exp_self.20260423064658.273_20260423_064700 Paper: self.20260423064658.273	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423064658.273 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 06:48	Success	-	View
exp_self.20260423063932.272_20260423_063934 Paper: self.20260423063932.272	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423063932.272 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 06:40	Success	-	View
exp_self.20260423063213.271_20260423_063217 Paper: self.20260423063213.271	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423063213.271 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 06:33	Success	-	View
exp_pytrain.20260423062858.067_20260423_062901 Paper: pytrain.20260423062858.067	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 06:30	Success	-	View
exp_self.20260423062219.270_20260423_062223 Paper: self.20260423062219.270	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423062219.270 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 06:23	Success	-	View
exp_self.20260423061456.269_20260423_061458 Paper: self.20260423061456.269	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423061456.269 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 06:16	Success	-	View
exp_self.20260423060726.268_20260423_060728 Paper: self.20260423060726.268	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423060726.268 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 06:08	Success	-	View
exp_self.20260423060002.267_20260423_060005 Paper: self.20260423060002.267	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423060002.267 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 06:01	Success	-	View
exp_pytrain.20260423055707.066_20260423_055708 Paper: pytrain.20260423055707.066	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 05:58	Success	-	View
exp_self.20260423055031.266_20260423_055034 Paper: self.20260423055031.266	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423055031.266 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 05:51	Success	-	View
exp_self.20260423054302.265_20260423_054304 Paper: self.20260423054302.265	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423054302.265 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 05:44	Success	-	View
exp_self.20260423053533.264_20260423_053537 Paper: self.20260423053533.264	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423053533.264 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 05:36	Success	-	View
exp_self.20260423052806.263_20260423_052807 Paper: self.20260423052806.263	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423052806.263 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 05:29	Success	-	View
exp_pytrain.20260423052511.065_20260423_052515 Paper: pytrain.20260423052511.065	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 05:26	Success	-	View
exp_self.20260423052020.262_20260423_052022 Paper: self.20260423052020.262	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423052020.262 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 05:21	Success	-	View
exp_self.20260423051255.261_20260423_051256 Paper: self.20260423051255.261	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423051255.261 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 05:13	Success	-	View
exp_self.20260423050522.260_20260423_050524 Paper: self.20260423050522.260	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423050522.260 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 05:06	Success	-	View
exp_self.20260423045750.259_20260423_045753 Paper: self.20260423045750.259	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423045750.259 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 04:58	Success	-	View
exp_pytrain.20260423045341.064_20260423_045343 Paper: pytrain.20260423045341.064	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 04:54	Success	-	View
exp_self.20260423045026.258_20260423_045028 Paper: self.20260423045026.258	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423045026.258 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 04:51	Success	-	View
exp_self.20260423044256.257_20260423_044257 Paper: self.20260423044256.257	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423044256.257 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 04:43	Success	-	View
exp_self.20260423043531.256_20260423_043532 Paper: self.20260423043531.256	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423043531.256 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 04:36	Success	-	View
exp_gh_Yigtwxx_Awesome-RAG-Production_20260423_043223 Paper: gh_Yigtwxx_Awesome-RAG-Production	Yigtwxx/Awesome-RAG-Production Paper ID: gh_Yigtwxx_Awesome-RAG-Production - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal:...	04-23 04:33	Success	-	View
exp_self.20260423042540.255_20260423_042541 Paper: self.20260423042540.255	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423042540.255 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 04:26	Success	-	View
exp_pytrain.20260423042220.063_20260423_042222 Paper: pytrain.20260423042220.063	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 04:23	Success	-	View
exp_self.20260423041717.254_20260423_041719 Paper: self.20260423041717.254	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423041717.254 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 04:18	Success	-	View
exp_self.20260423040956.253_20260423_040959 Paper: self.20260423040956.253	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423040956.253 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 04:11	Success	-	View
exp_self.20260423040245.252_20260423_040246 Paper: self.20260423040245.252	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423040245.252 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 04:03	Success	-	View
exp_hf_2604.19572_20260423_035835 Paper: hf_2604.19572	A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression Paper ID: hf_2604.19572 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-23 03:59	Success	-	View
exp_self.20260423035352.251_20260423_035354 Paper: self.20260423035352.251	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423035352.251 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 03:54	Success	-	View
exp_pytrain.20260423035057.062_20260423_035058 Paper: pytrain.20260423035057.062	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 03:52	Success	-	View
exp_self.20260423034435.250_20260423_034436 Paper: self.20260423034435.250	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423034435.250 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 03:45	Success	-	View
exp_self.20260423033640.249_20260423_033645 Paper: self.20260423033640.249	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423033640.249 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 03:37	Success	-	View
exp_self.20260423032911.248_20260423_032913 Paper: self.20260423032911.248	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423032911.248 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 03:30	Success	-	View
exp_self.20260423032141.247_20260423_032143 Paper: self.20260423032141.247	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423032141.247 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 03:22	Success	-	View
exp_pytrain.20260423031827.061_20260423_031831 Paper: pytrain.20260423031827.061	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 03:19	Success	-	View
exp_self.20260423031146.246_20260423_031148 Paper: self.20260423031146.246	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423031146.246 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 03:12	Success	-	View
exp_self.20260423030416.245_20260423_030417 Paper: self.20260423030416.245	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423030416.245 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 03:05	Success	-	View
exp_self.20260423025717.244_20260423_025718 Paper: self.20260423025717.244	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423025717.244 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 02:58	Success	-	View
exp_self.20260423024957.243_20260423_024958 Paper: self.20260423024957.243	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423024957.243 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 02:51	Success	-	View
exp_pytrain.20260423024539.060_20260423_024540 Paper: pytrain.20260423024539.060	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 02:46	Success	-	View
exp_self.20260423024150.242_20260423_024150 Paper: self.20260423024150.242	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423024150.242 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 02:42	Success	-	View
exp_self.20260423023428.241_20260423_023430 Paper: self.20260423023428.241	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423023428.241 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 02:35	Success	-	View
exp_self.20260423022702.240_20260423_022704 Paper: self.20260423022702.240	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423022702.240 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 02:28	Success	-	View
exp_hf_2604.18982_20260423_022354 Paper: hf_2604.18982	SAVOIR: Learning Social Savoir-Faire via Shapley-based Reward Attribution Paper ID: hf_2604.18982 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-23 02:24	Success	-	View
exp_self.20260423021714.239_20260423_021715 Paper: self.20260423021714.239	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423021714.239 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 02:18	Success	-	View
exp_pytrain.20260423021406.059_20260423_021407 Paper: pytrain.20260423021406.059	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 02:15	Success	-	View
exp_self.20260423020920.238_20260423_020921 Paper: self.20260423020920.238	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423020920.238 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 02:10	Success	-	View
exp_self.20260423020155.237_20260423_020156 Paper: self.20260423020155.237	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423020155.237 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 02:02	Success	-	View
exp_self.20260423015433.236_20260423_015435 Paper: self.20260423015433.236	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423015433.236 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 01:55	Success	-	View
exp_gh_clareembattled960_turboQuantPlayground_20260423_015047 Paper: gh_clareembattled960_turboQuantPlayground	clareembattled960/turboQuantPlayground Paper ID: gh_clareembattled960_turboQuantPlayground - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected...	04-23 01:51	Success	-	View
exp_hf_2604.16529_20260423_014821 Paper: hf_2604.16529	Scaling Test-Time Compute for Agentic Coding Paper ID: hf_2604.16529 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-23 01:49	Success	-	View
exp_self.20260423014555.235_20260423_014558 Paper: self.20260423014555.235	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423014555.235 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 01:47	Success	-	View
exp_pytrain.20260423014244.058_20260423_014245 Paper: pytrain.20260423014244.058	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 01:43	Success	-	View
exp_self.20260423013807.234_20260423_013809 Paper: self.20260423013807.234	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423013807.234 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 01:39	Success	-	View
exp_self.20260423013057.233_20260423_013059 Paper: self.20260423013057.233	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423013057.233 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 01:32	Success	-	View
exp_self.20260423012332.232_20260423_012332 Paper: self.20260423012332.232	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423012332.232 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 01:24	Success	-	View
exp_self.20260423011620.231_20260423_011623 Paper: self.20260423011620.231	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423011620.231 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 01:17	Success	-	View
exp_pytrain.20260423011101.057_20260423_011102 Paper: pytrain.20260423011101.057	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 01:12	Success	-	View
exp_self.20260423010849.230_20260423_010851 Paper: self.20260423010849.230	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423010849.230 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 01:09	Success	-	View
exp_self.20260423010128.229_20260423_010132 Paper: self.20260423010128.229	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423010128.229 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 01:02	Success	-	View
exp_self.20260423005406.228_20260423_005408 Paper: self.20260423005406.228	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423005406.228 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 00:55	Success	-	View
exp_self.20260423004706.227_20260423_004709 Paper: self.20260423004706.227	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423004706.227 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 00:48	Success	-	View
exp_self.20260423003953.226_20260423_003954 Paper: self.20260423003953.226	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423003953.226 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 00:40	Success	-	View
exp_pytrain.20260423003700.056_20260423_003702 Paper: pytrain.20260423003700.056	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 00:38	Success	-	View
exp_self.20260423003034.225_20260423_003036 Paper: self.20260423003034.225	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423003034.225 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 00:31	Success	-	View
exp_self.20260423002316.224_20260423_002318 Paper: self.20260423002316.224	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423002316.224 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 00:24	Success	-	View
exp_self.20260423001548.223_20260423_001551 Paper: self.20260423001548.223	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423001548.223 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 00:16	Success	-	View
exp_self.20260423000804.222_20260423_000806 Paper: self.20260423000804.222	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260423000804.222 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-23 00:09	Success	-	View
exp_pytrain.20260423000434.055_20260423_000437 Paper: pytrain.20260423000434.055	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-23 00:05	Success	-	View
exp_self.20260422235756.221_20260422_235757 Paper: self.20260422235756.221	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422235756.221 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 23:58	Success	-	View
exp_self.20260422235029.220_20260422_235030 Paper: self.20260422235029.220	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422235029.220 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 23:51	Success	-	View
exp_self.20260422234309.219_20260422_234311 Paper: self.20260422234309.219	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422234309.219 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 23:44	Success	-	View
exp_self.20260422233552.218_20260422_233553 Paper: self.20260422233552.218	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422233552.218 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 23:36	Success	-	View
exp_pytrain.20260422233247.054_20260422_233250 Paper: pytrain.20260422233247.054	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 23:33	Success	-	View
exp_self.20260422232628.217_20260422_232629 Paper: self.20260422232628.217	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422232628.217 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 23:27	Success	-	View
exp_self.20260422231903.216_20260422_231904 Paper: self.20260422231903.216	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422231903.216 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 23:20	Success	-	View
exp_self.20260422231101.215_20260422_231102 Paper: self.20260422231101.215	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422231101.215 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 23:12	Success	-	View
exp_self.20260422230345.214_20260422_230346 Paper: self.20260422230345.214	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422230345.214 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 23:04	Success	-	View
exp_pytrain.20260422230121.053_20260422_230122 Paper: pytrain.20260422230121.053	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 23:02	Success	-	View
exp_hf_2604.20570_20260422_225907 Paper: hf_2604.20570	Exploring Spatial Intelligence from a Generative Perspective Paper ID: hf_2604.20570 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-22 23:00	Success	-	View
exp_self.20260422225604.213_20260422_225604 Paper: self.20260422225604.213	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422225604.213 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 22:57	Success	-	View
exp_self.20260422224850.212_20260422_224850 Paper: self.20260422224850.212	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422224850.212 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 22:49	Success	-	View
exp_self.20260422224136.211_20260422_224136 Paper: self.20260422224136.211	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422224136.211 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 22:42	Success	-	View
exp_self.20260422223420.210_20260422_223420 Paper: self.20260422223420.210	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422223420.210 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 22:35	Success	-	View
exp_pytrain.20260422222837.052_20260422_222837 Paper: pytrain.20260422222837.052	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 22:29	Success	-	View
exp_self.20260422222645.209_20260422_222646 Paper: self.20260422222645.209	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422222645.209 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 22:27	Success	-	View
exp_self.20260422221930.208_20260422_221930 Paper: self.20260422221930.208	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422221930.208 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 22:20	Success	-	View
exp_hf_2604.20817_20260422_221402 Paper: hf_2604.20817	Convergent Evolution: How Different Language Models Learn Similar Number Representations Paper ID: hf_2604.20817 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-22 22:15	Success	-	View
exp_self.20260422221208.207_20260422_221208 Paper: self.20260422221208.207	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422221208.207 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 22:13	Success	-	View
exp_self.20260422220451.206_20260422_220452 Paper: self.20260422220451.206	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422220451.206 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 22:05	Success	-	View
exp_self.20260422215731.205_20260422_215732 Paper: self.20260422215731.205	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422215731.205 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 21:58	Success	-	View
exp_pytrain.20260422215512.051_20260422_215513 Paper: pytrain.20260422215512.051	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 21:56	Success	-	View
exp_2604.20842v1_20260422_215259 Paper: 2604.20842v1	SpeechParaling-Bench: A Comprehensive Benchmark for Paralinguistic-Aware Speech Generation Paper ID: 2604.20842v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-22 21:54	Success	-	View
exp_self.20260422214954.204_20260422_214955 Paper: self.20260422214954.204	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422214954.204 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 21:50	Success	-	View
exp_self.20260422214236.203_20260422_214236 Paper: self.20260422214236.203	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422214236.203 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 21:43	Success	-	View
exp_hf_2604.14932_20260422_213919 Paper: hf_2604.14932	WavAlign: Enhancing Intelligence and Expressiveness in Spoken Dialogue Models via Adaptive Hybrid Post-Training Paper ID: hf_2604.14932 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-22 21:40	Success	-	View
exp_self.20260422213506.202_20260422_213506 Paper: self.20260422213506.202	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422213506.202 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 21:36	Success	-	View
exp_self.20260422212745.201_20260422_212746 Paper: self.20260422212745.201	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422212745.201 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 21:28	Success	-	View
exp_pytrain.20260422212202.050_20260422_212202 Paper: pytrain.20260422212202.050	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 21:23	Success	-	View
exp_self.20260422212011.200_20260422_212011 Paper: self.20260422212011.200	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422212011.200 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 21:21	Success	-	View
exp_gh_SonySemiconductorSolutions_mct-model-optimization_20260422_211728 Paper: gh_SonySemiconductorSolutions_mct-model-optimization	SonySemiconductorSolutions/mct-model-optimization Paper ID: gh_SonySemiconductorSolutions_mct-model-optimization - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry....	04-22 21:18	Success	-	View
exp_hf_2604.19902_20260422_211221 Paper: hf_2604.19902	MMCORE: MultiModal COnnection with Representation Aligned Latent Embeddings Paper ID: hf_2604.19902 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-22 21:13	Success	-	View
exp_self.20260422211028.199_20260422_211028 Paper: self.20260422211028.199	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422211028.199 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 21:11	Success	-	View
exp_hf_2604.20796_20260422_210713 Paper: hf_2604.20796	LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper ID: hf_2604.20796 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-22 21:08	Success	-	View
exp_2604.20688v1_20260422_210456 Paper: 2604.20688v1	Storm Surge Modeling, Bias Correction, Graph Neural Networks, Graph Convolution Networks Paper ID: 2604.20688v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-22 21:05	Success	-	View
exp_self.20260422210257.198_20260422_210258 Paper: self.20260422210257.198	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422210257.198 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 21:04	Success	-	View
exp_2604.20682v1_20260422_205945 Paper: 2604.20682v1	Variance Is Not Importance: Structural Analysis of Transformer Compressibility Across Model Scales Paper ID: 2604.20682v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-22 21:00	Success	-	View
exp_self.20260422205537.197_20260422_205537 Paper: self.20260422205537.197	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422205537.197 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 20:56	Success	-	View
exp_pytrain.20260422204924.049_20260422_204924 Paper: pytrain.20260422204924.049	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 20:50	Success	-	View
exp_self.20260422204733.196_20260422_204734 Paper: self.20260422204733.196	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422204733.196 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 20:48	Success	-	View
exp_self.20260422204019.195_20260422_204020 Paper: self.20260422204019.195	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422204019.195 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 20:41	Success	-	View
exp_self.20260422203255.194_20260422_203256 Paper: self.20260422203255.194	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422203255.194 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 20:33	Success	-	View
exp_self.20260422202539.193_20260422_202540 Paper: self.20260422202539.193	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422202539.193 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 20:26	Success	-	View
exp_self.20260422201825.192_20260422_201825 Paper: self.20260422201825.192	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422201825.192 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 20:19	Success	-	View
exp_pytrain.20260422201606.048_20260422_201607 Paper: pytrain.20260422201606.048	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 20:17	Success	-	View
exp_self.20260422200920.191_20260422_200921 Paper: self.20260422200920.191	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422200920.191 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 20:10	Success	-	View
exp_hf_2604.15664_20260422_200607 Paper: hf_2604.15664	Stargazer: A Scalable Model-Fitting Benchmark Environment for AI Agents under Astrophysical Constraints Paper ID: hf_2604.15664 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-22 20:07	Success	-	View
exp_self.20260422200054.190_20260422_200054 Paper: self.20260422200054.190	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422200054.190 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 20:01	Success	-	View
exp_self.20260422195334.189_20260422_195334 Paper: self.20260422195334.189	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422195334.189 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 19:54	Success	-	View
exp_self.20260422194617.188_20260422_194618 Paper: self.20260422194617.188	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422194617.188 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 19:47	Success	-	View
exp_pytrain.20260422194400.047_20260422_194401 Paper: pytrain.20260422194400.047	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 19:45	Success	-	View
exp_self.20260422193715.187_20260422_193716 Paper: self.20260422193715.187	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422193715.187 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 19:38	Success	-	View
exp_self.20260422193001.186_20260422_193001 Paper: self.20260422193001.186	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422193001.186 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 19:31	Success	-	View
exp_self.20260422192241.185_20260422_192241 Paper: self.20260422192241.185	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422192241.185 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 19:23	Success	-	View
exp_self.20260422191522.184_20260422_191523 Paper: self.20260422191522.184	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422191522.184 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 19:16	Success	-	View
exp_pytrain.20260422191154.046_20260422_191155 Paper: pytrain.20260422191154.046	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 19:12	Success	-	View
exp_self.20260422190748.183_20260422_190749 Paper: self.20260422190748.183	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422190748.183 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 19:08	Success	-	View
exp_self.20260422190030.182_20260422_190031 Paper: self.20260422190030.182	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422190030.182 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 19:01	Success	-	View
exp_self.20260422185315.181_20260422_185316 Paper: self.20260422185315.181	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422185315.181 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 18:54	Success	-	View
exp_self.20260422184559.180_20260422_184600 Paper: self.20260422184559.180	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422184559.180 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 18:47	Success	-	View
exp_pytrain.20260422184016.045_20260422_184016 Paper: pytrain.20260422184016.045	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 18:41	Success	-	View
exp_self.20260422183824.179_20260422_183824 Paper: self.20260422183824.179	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422183824.179 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 18:39	Success	-	View
exp_self.20260422183109.178_20260422_183110 Paper: self.20260422183109.178	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422183109.178 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 18:32	Success	-	View
exp_self.20260422182355.177_20260422_182355 Paper: self.20260422182355.177	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422182355.177 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 18:24	Success	-	View
exp_self.20260422181634.176_20260422_181635 Paper: self.20260422181634.176	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422181634.176 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 18:17	Success	-	View
exp_self.20260422180918.175_20260422_180918 Paper: self.20260422180918.175	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422180918.175 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 18:10	Success	-	View
exp_pytrain.20260422180700.044_20260422_180700 Paper: pytrain.20260422180700.044	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 18:08	Success	-	View
exp_self.20260422180015.174_20260422_180016 Paper: self.20260422180015.174	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422180015.174 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 18:01	Success	-	View
exp_self.20260422175300.173_20260422_175300 Paper: self.20260422175300.173	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422175300.173 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 17:54	Success	-	View
exp_self.20260422174540.172_20260422_174540 Paper: self.20260422174540.172	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422174540.172 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 17:46	Success	-	View
exp_self.20260422173822.171_20260422_173822 Paper: self.20260422173822.171	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422173822.171 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 17:39	Success	-	View
exp_pytrain.20260422173454.043_20260422_173454 Paper: pytrain.20260422173454.043	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 17:35	Success	-	View
exp_self.20260422173049.170_20260422_173049 Paper: self.20260422173049.170	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422173049.170 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 17:31	Success	-	View
exp_self.20260422172332.169_20260422_172332 Paper: self.20260422172332.169	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422172332.169 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 17:24	Success	-	View
exp_self.20260422171616.168_20260422_171617 Paper: self.20260422171616.168	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422171616.168 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 17:17	Success	-	View
exp_self.20260422170902.167_20260422_170902 Paper: self.20260422170902.167	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422170902.167 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 17:10	Success	-	View
exp_pytrain.20260422170318.042_20260422_170318 Paper: pytrain.20260422170318.042	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 17:04	Success	-	View
exp_self.20260422170127.166_20260422_170128 Paper: self.20260422170127.166	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422170127.166 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 17:02	Success	-	View
exp_self.20260422165413.165_20260422_165414 Paper: self.20260422165413.165	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422165413.165 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 16:55	Success	-	View
exp_self.20260422164659.164_20260422_164659 Paper: self.20260422164659.164	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422164659.164 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 16:48	Success	-	View
exp_self.20260422163937.163_20260422_163938 Paper: self.20260422163937.163	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422163937.163 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 16:40	Success	-	View
exp_self.20260422163221.162_20260422_163221 Paper: self.20260422163221.162	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422163221.162 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 16:33	Success	-	View
exp_pytrain.20260422163004.041_20260422_163004 Paper: pytrain.20260422163004.041	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 16:31	Success	-	View
exp_self.20260422162311.161_20260422_162312 Paper: self.20260422162311.161	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422162311.161 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 16:24	Success	-	View
exp_self.20260422161552.160_20260422_161552 Paper: self.20260422161552.160	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422161552.160 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 16:16	Success	-	View
exp_self.20260422160834.159_20260422_160835 Paper: self.20260422160834.159	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422160834.159 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 16:09	Success	-	View
exp_self.20260422160119.158_20260422_160120 Paper: self.20260422160119.158	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422160119.158 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 16:02	Success	-	View
exp_pytrain.20260422155849.040_20260422_155849 Paper: pytrain.20260422155849.040	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 15:59	Success	-	View
exp_self.20260422155159.157_20260422_155200 Paper: self.20260422155159.157	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422155159.157 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 15:53	Success	-	View
exp_self.20260422154434.156_20260422_154434 Paper: self.20260422154434.156	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422154434.156 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 15:45	Success	-	View
exp_self.20260422153706.155_20260422_153707 Paper: self.20260422153706.155	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422153706.155 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 15:38	Success	-	View
exp_self.20260422152932.154_20260422_152932 Paper: self.20260422152932.154	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422152932.154 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 15:30	Success	-	View
exp_pytrain.20260422152657.039_20260422_152658 Paper: pytrain.20260422152657.039	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 15:28	Success	-	View
exp_self.20260422151952.153_20260422_151953 Paper: self.20260422151952.153	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422151952.153 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 15:20	Success	-	View
exp_self.20260422151212.152_20260422_151213 Paper: self.20260422151212.152	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422151212.152 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 15:13	Success	-	View
exp_self.20260422150437.151_20260422_150438 Paper: self.20260422150437.151	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422150437.151 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 15:05	Success	-	View
exp_self.20260422145653.150_20260422_145653 Paper: self.20260422145653.150	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422145653.150 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 14:57	Success	-	View
exp_pytrain.20260422145407.038_20260422_145408 Paper: pytrain.20260422145407.038	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 14:55	Success	-	View
exp_self.20260422144825.149_20260422_144825 Paper: self.20260422144825.149	Self-directed benchmark: ssm_mamba strategy stress test Paper ID: self.20260422144825.149 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 14:49	Success	-	View
exp_self.20260422144045.148_20260422_144045 Paper: self.20260422144045.148	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422144045.148 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 14:41	Success	-	View
exp_self.20260422143259.147_20260422_143259 Paper: self.20260422143259.147	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422143259.147 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 14:34	Success	-	View
exp_self.20260422142514.146_20260422_142514 Paper: self.20260422142514.146	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422142514.146 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 14:26	Success	-	View
exp_pytrain.20260422142241.037_20260422_142241 Paper: pytrain.20260422142241.037	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 14:23	Success	-	View
exp_self.20260422141534.145_20260422_141535 Paper: self.20260422141534.145	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422141534.145 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 14:16	Success	-	View
exp_self.20260422140759.144_20260422_140759 Paper: self.20260422140759.144	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422140759.144 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 14:09	Success	-	View
exp_self.20260422140016.143_20260422_140016 Paper: self.20260422140016.143	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422140016.143 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 14:01	Success	-	View
exp_self.20260422135233.142_20260422_135233 Paper: self.20260422135233.142	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422135233.142 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 13:53	Success	-	View
exp_pytrain.20260422134959.036_20260422_135000 Paper: pytrain.20260422134959.036	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 13:51	Success	-	View
exp_self.20260422134402.141_20260422_134402 Paper: self.20260422134402.141	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422134402.141 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 13:45	Success	-	View
exp_self.20260422133619.140_20260422_133619 Paper: self.20260422133619.140	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422133619.140 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 13:37	Success	-	View
exp_self.20260422132839.139_20260422_132839 Paper: self.20260422132839.139	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422132839.139 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 13:29	Success	-	View
exp_self.20260422132049.138_20260422_132050 Paper: self.20260422132049.138	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422132049.138 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 13:21	Success	-	View
exp_pytrain.20260422131809.035_20260422_131809 Paper: pytrain.20260422131809.035	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 13:19	Success	-	View
exp_self.20260422131105.137_20260422_131105 Paper: self.20260422131105.137	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422131105.137 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 13:12	Success	-	View
exp_self.20260422130326.136_20260422_130326 Paper: self.20260422130326.136	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422130326.136 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 13:04	Success	-	View
exp_self.20260422125553.135_20260422_125553 Paper: self.20260422125553.135	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422125553.135 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 12:56	Success	-	View
exp_self.20260422124820.134_20260422_124821 Paper: self.20260422124820.134	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422124820.134 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 12:49	Success	-	View
exp_pytrain.20260422124545.034_20260422_124545 Paper: pytrain.20260422124545.034	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 12:46	Success	-	View
exp_self.20260422123848.133_20260422_123848 Paper: self.20260422123848.133	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422123848.133 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 12:39	Success	-	View
exp_self.20260422123105.132_20260422_123105 Paper: self.20260422123105.132	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422123105.132 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 12:32	Success	-	View
exp_self.20260422122325.131_20260422_122326 Paper: self.20260422122325.131	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422122325.131 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 12:24	Success	-	View
exp_self.20260422121551.130_20260422_121552 Paper: self.20260422121551.130	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422121551.130 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 12:16	Success	-	View
exp_pytrain.20260422121322.033_20260422_121322 Paper: pytrain.20260422121322.033	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 12:14	Success	-	View
exp_self.20260422120614.129_20260422_120614 Paper: self.20260422120614.129	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422120614.129 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 12:07	Success	-	View
exp_self.20260422115837.128_20260422_115837 Paper: self.20260422115837.128	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422115837.128 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 11:59	Success	-	View
exp_self.20260422115056.127_20260422_115057 Paper: self.20260422115056.127	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422115056.127 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 11:52	Success	-	View
exp_self.20260422114320.126_20260422_114320 Paper: self.20260422114320.126	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422114320.126 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 11:44	Success	-	View
exp_pytrain.20260422114049.032_20260422_114049 Paper: pytrain.20260422114049.032	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 11:41	Success	-	View
exp_self.20260422113341.125_20260422_113342 Paper: self.20260422113341.125	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422113341.125 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 11:34	Success	-	View
exp_self.20260422112605.124_20260422_112606 Paper: self.20260422112605.124	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422112605.124 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 11:27	Success	-	View
exp_self.20260422111822.123_20260422_111823 Paper: self.20260422111822.123	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422111822.123 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 11:19	Success	-	View
exp_self.20260422111041.122_20260422_111041 Paper: self.20260422111041.122	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422111041.122 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 11:11	Success	-	View
exp_pytrain.20260422110808.031_20260422_110808 Paper: pytrain.20260422110808.031	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 11:09	Success	-	View
exp_self.20260422110056.121_20260422_110057 Paper: self.20260422110056.121	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422110056.121 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 11:02	Success	-	View
exp_self.20260422105318.120_20260422_105318 Paper: self.20260422105318.120	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422105318.120 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 10:54	Success	-	View
exp_self.20260422104541.119_20260422_104541 Paper: self.20260422104541.119	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422104541.119 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 10:46	Success	-	View
exp_self.20260422103758.118_20260422_103759 Paper: self.20260422103758.118	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422103758.118 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 10:39	Success	-	View
exp_pytrain.20260422103526.030_20260422_103526 Paper: pytrain.20260422103526.030	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 10:36	Success	-	View
exp_self.20260422103001.117_20260422_103002 Paper: self.20260422103001.117	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422103001.117 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 10:31	Success	-	View
exp_self.20260422102215.116_20260422_102215 Paper: self.20260422102215.116	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422102215.116 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 10:23	Success	-	View
exp_self.20260422101433.115_20260422_101434 Paper: self.20260422101433.115	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422101433.115 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 10:15	Success	-	View
exp_self.20260422100645.114_20260422_100646 Paper: self.20260422100645.114	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422100645.114 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 10:07	Success	-	View
exp_pytrain.20260422100346.029_20260422_100346 Paper: pytrain.20260422100346.029	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 10:04	Success	-	View
exp_self.20260422095726.113_20260422_095726 Paper: self.20260422095726.113	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422095726.113 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 09:58	Success	-	View
exp_self.20260422094951.112_20260422_094951 Paper: self.20260422094951.112	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422094951.112 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 09:50	Success	-	View
exp_self.20260422094215.111_20260422_094216 Paper: self.20260422094215.111	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422094215.111 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 09:43	Success	-	View
exp_self.20260422093442.110_20260422_093443 Paper: self.20260422093442.110	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422093442.110 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 09:35	Success	-	View
exp_pytrain.20260422093216.028_20260422_093216 Paper: pytrain.20260422093216.028	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 09:33	Success	-	View
exp_self.20260422092508.109_20260422_092509 Paper: self.20260422092508.109	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422092508.109 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 09:26	Success	-	View
exp_self.20260422091733.108_20260422_091734 Paper: self.20260422091733.108	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422091733.108 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 09:18	Success	-	View
exp_self.20260422091014.107_20260422_091015 Paper: self.20260422091014.107	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422091014.107 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 09:11	Success	-	View
exp_self.20260422090235.106_20260422_090236 Paper: self.20260422090235.106	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422090235.106 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 09:03	Success	-	View
exp_pytrain.20260422085930.027_20260422_085931 Paper: pytrain.20260422085930.027	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 09:00	Success	-	View
exp_self.20260422085302.105_20260422_085303 Paper: self.20260422085302.105	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422085302.105 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 08:54	Success	-	View
exp_self.20260422084535.104_20260422_084537 Paper: self.20260422084535.104	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422084535.104 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 08:46	Success	-	View
exp_hf_2604.19642_20260422_084027 Paper: hf_2604.19642	Micro Language Models Enable Instant Responses Paper ID: hf_2604.19642 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-22 08:41	Success	-	View
exp_self.20260422083758.103_20260422_083759 Paper: self.20260422083758.103	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422083758.103 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 08:39	Success	-	View
exp_self.20260422083025.102_20260422_083028 Paper: self.20260422083025.102	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422083025.102 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 08:31	Success	-	View
exp_pytrain.20260422082712.026_20260422_082714 Paper: pytrain.20260422082712.026	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 08:28	Success	-	View
exp_self.20260422082033.101_20260422_082034 Paper: self.20260422082033.101	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422082033.101 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 08:21	Success	-	View
exp_self.20260422081307.100_20260422_081309 Paper: self.20260422081307.100	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422081307.100 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 08:14	Success	-	View
exp_self.20260422080529.099_20260422_080531 Paper: self.20260422080529.099	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422080529.099 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 08:06	Success	-	View
exp_self.20260422075750.098_20260422_075753 Paper: self.20260422075750.098	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422075750.098 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 07:58	Success	-	View
exp_pytrain.20260422075430.025_20260422_075432 Paper: pytrain.20260422075430.025	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 07:55	Success	-	View
exp_self.20260422074938.097_20260422_074940 Paper: self.20260422074938.097	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422074938.097 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 07:50	Success	-	View
exp_self.20260422074224.096_20260422_074225 Paper: self.20260422074224.096	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422074224.096 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 07:43	Success	-	View
exp_self.20260422073451.095_20260422_073451 Paper: self.20260422073451.095	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422073451.095 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 07:35	Success	-	View
exp_self.20260422072630.094_20260422_072631 Paper: self.20260422072630.094	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422072630.094 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 07:27	Success	-	View
exp_pytrain.20260422072220.024_20260422_072221 Paper: pytrain.20260422072220.024	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 07:23	Success	-	View
exp_self.20260422071830.093_20260422_071830 Paper: self.20260422071830.093	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422071830.093 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 07:19	Success	-	View
exp_self.20260422071026.092_20260422_071028 Paper: self.20260422071026.092	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422071026.092 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 07:11	Success	-	View
exp_self.20260422070221.091_20260422_070222 Paper: self.20260422070221.091	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422070221.091 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 07:03	Success	-	View
exp_self.20260422065440.090_20260422_065442 Paper: self.20260422065440.090	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422065440.090 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 06:55	Success	-	View
exp_pytrain.20260422065027.023_20260422_065028 Paper: pytrain.20260422065027.023	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 06:51	Success	-	View
exp_self.20260422064653.089_20260422_064654 Paper: self.20260422064653.089	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422064653.089 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 06:47	Success	-	View
exp_self.20260422063941.088_20260422_063943 Paper: self.20260422063941.088	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422063941.088 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 06:40	Success	-	View
exp_self.20260422063144.087_20260422_063144 Paper: self.20260422063144.087	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422063144.087 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 06:32	Success	-	View
exp_self.20260422062348.086_20260422_062348 Paper: self.20260422062348.086	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422062348.086 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 06:24	Success	-	View
exp_pytrain.20260422061859.022_20260422_061900 Paper: pytrain.20260422061859.022	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 06:20	Success	-	View
exp_self.20260422061703.085_20260422_061703 Paper: self.20260422061703.085	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422061703.085 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 06:18	Success	-	View
exp_self.20260422061020.084_20260422_061020 Paper: self.20260422061020.084	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422061020.084 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 06:11	Success	-	View
exp_self.20260422060328.083_20260422_060329 Paper: self.20260422060328.083	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422060328.083 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 06:04	Success	-	View
exp_self.20260422055647.082_20260422_055647 Paper: self.20260422055647.082	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422055647.082 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 05:57	Success	-	View
exp_self.20260422054905.081_20260422_054907 Paper: self.20260422054905.081	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422054905.081 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 05:50	Success	-	View
exp_pytrain.20260422054614.021_20260422_054614 Paper: pytrain.20260422054614.021	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 05:47	Success	-	View
exp_self.20260422053952.080_20260422_053954 Paper: self.20260422053952.080	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422053952.080 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 05:40	Success	-	View
exp_self.20260422053254.079_20260422_053254 Paper: self.20260422053254.079	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422053254.079 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 05:33	Success	-	View
exp_self.20260422052528.078_20260422_052529 Paper: self.20260422052528.078	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422052528.078 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 05:26	Success	-	View
exp_self.20260422051732.077_20260422_051732 Paper: self.20260422051732.077	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422051732.077 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 05:18	Success	-	View
exp_pytrain.20260422051430.020_20260422_051432 Paper: pytrain.20260422051430.020	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 05:15	Success	-	View
exp_self.20260422050955.076_20260422_050956 Paper: self.20260422050955.076	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422050955.076 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 05:10	Success	-	View
exp_self.20260422050136.075_20260422_050139 Paper: self.20260422050136.075	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422050136.075 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 05:02	Success	-	View
exp_self.20260422045424.074_20260422_045424 Paper: self.20260422045424.074	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422045424.074 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 04:55	Success	-	View
exp_self.20260422044644.073_20260422_044644 Paper: self.20260422044644.073	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422044644.073 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 04:47	Success	-	View
exp_pytrain.20260422044237.019_20260422_044238 Paper: pytrain.20260422044237.019	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 04:43	Success	-	View
exp_self.20260422043907.072_20260422_043907 Paper: self.20260422043907.072	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422043907.072 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 04:40	Success	-	View
exp_self.20260422043208.071_20260422_043208 Paper: self.20260422043208.071	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422043208.071 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 04:33	Success	-	View
exp_self.20260422042443.070_20260422_042444 Paper: self.20260422042443.070	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422042443.070 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 04:25	Success	-	View
exp_self.20260422041648.069_20260422_041657 Paper: self.20260422041648.069	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422041648.069 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 04:17	Success	-	View
exp_pytrain.20260422041052.018_20260422_041053 Paper: pytrain.20260422041052.018	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 04:11	Success	-	View
exp_self.20260422040739.068_20260422_040742 Paper: self.20260422040739.068	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422040739.068 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 04:08	Success	-	View
exp_self.20260422035911.067_20260422_035914 Paper: self.20260422035911.067	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422035911.067 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 04:00	Success	-	View
exp_self.20260422035056.066_20260422_035059 Paper: self.20260422035056.066	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422035056.066 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 03:52	Success	-	View
exp_self.20260422034159.065_20260422_034159 Paper: self.20260422034159.065	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422034159.065 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 03:43	Success	-	View
exp_pytrain.20260422033659.017_20260422_033659 Paper: pytrain.20260422033659.017	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 03:38	Success	-	View
exp_self.20260422033428.064_20260422_033432 Paper: self.20260422033428.064	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422033428.064 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 03:35	Success	-	View
exp_self.20260422032625.063_20260422_032629 Paper: self.20260422032625.063	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422032625.063 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 03:27	Success	-	View
exp_self.20260422031746.062_20260422_031746 Paper: self.20260422031746.062	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422031746.062 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 03:18	Success	-	View
exp_self.20260422030953.061_20260422_030953 Paper: self.20260422030953.061	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422030953.061 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 03:10	Success	-	View
exp_pytrain.20260422030453.016_20260422_030453 Paper: pytrain.20260422030453.016	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 03:05	Success	-	View
exp_self.20260422030232.060_20260422_030232 Paper: self.20260422030232.060	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422030232.060 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 03:03	Success	-	View
exp_hf_2604.19254_20260422_025934 Paper: hf_2604.19254	ShadowPEFT: Shadow Network for Parameter-Efficient Fine-Tuning Paper ID: hf_2604.19254 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-22 03:00	Success	-	View
exp_self.20260422025204.059_20260422_025205 Paper: self.20260422025204.059	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422025204.059 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 02:53	Success	-	View
exp_self.20260422024442.058_20260422_024444 Paper: self.20260422024442.058	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422024442.058 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 02:45	Success	-	View
exp_cr_10.1017_rsm.2026.10094_20260422_024056 Paper: cr_10.1017_rsm.2026.10094	Large language model-based paper classification framework with key-insight extraction and confidence-weighted voting Paper ID: cr_10.1017_rsm.2026.10094 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovere...	04-22 02:41	Success	-	View
exp_self.20260422023721.057_20260422_023721 Paper: self.20260422023721.057	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422023721.057 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 02:38	Success	-	View
exp_pytrain.20260422023318.015_20260422_023319 Paper: pytrain.20260422023318.015	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 02:34	Success	-	View
exp_self.20260422022949.056_20260422_022950 Paper: self.20260422022949.056	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422022949.056 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 02:30	Success	-	View
exp_self.20260422022216.055_20260422_022217 Paper: self.20260422022216.055	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422022216.055 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 02:23	Success	-	View
exp_self.20260422021450.054_20260422_021451 Paper: self.20260422021450.054	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422021450.054 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 02:15	Success	-	View
exp_self.20260422020719.053_20260422_020720 Paper: self.20260422020719.053	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422020719.053 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 02:08	Success	-	View
exp_hf_2604.17982_20260422_020405 Paper: hf_2604.17982	Mitigating Multimodal Hallucination via Phase-wise Self-reward Paper ID: hf_2604.17982 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-22 02:05	Success	-	View
exp_pytrain.20260422020155.014_20260422_020156 Paper: pytrain.20260422020155.014	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 02:02	Success	-	View
exp_hf_2604.16913_20260422_015746 Paper: hf_2604.16913	The Cognitive Penalty: Ablating System 1 and System 2 Reasoning in Edge-Native SLMs for Decentralized Consensus Paper ID: hf_2604.16913 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-22 01:58	Success	-	View
exp_self.20260422015527.052_20260422_015529 Paper: self.20260422015527.052	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422015527.052 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 01:56	Success	-	View
exp_hf_2604.16054_20260422_015122 Paper: hf_2604.16054	Mind's Eye: A Benchmark of Visual Abstraction, Transformation and Composition for Multimodal LLMs Paper ID: hf_2604.16054 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-22 01:52	Success	-	View
exp_self.20260422014802.051_20260422_014802 Paper: self.20260422014802.051	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422014802.051 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 01:49	Success	-	View
exp_self.20260422014102.050_20260422_014104 Paper: self.20260422014102.050	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422014102.050 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 01:42	Success	-	View
exp_self.20260422013313.049_20260422_013314 Paper: self.20260422013313.049	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422013313.049 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 01:34	Success	-	View
exp_pytrain.20260422013013.013_20260422_013014 Paper: pytrain.20260422013013.013	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 01:31	Success	-	View
exp_self.20260422012521.048_20260422_012521 Paper: self.20260422012521.048	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422012521.048 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 01:26	Success	-	View
exp_self.20260422011831.047_20260422_011832 Paper: self.20260422011831.047	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422011831.047 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 01:19	Success	-	View
exp_self.20260422011104.046_20260422_011106 Paper: self.20260422011104.046	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422011104.046 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 01:12	Success	-	View
exp_self.20260422010233.045_20260422_010234 Paper: self.20260422010233.045	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422010233.045 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 01:03	Success	-	View
exp_pytrain.20260422005828.012_20260422_005830 Paper: pytrain.20260422005828.012	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 00:59	Success	-	View
exp_self.20260422005526.044_20260422_005526 Paper: self.20260422005526.044	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422005526.044 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 00:56	Success	-	View
exp_self.20260422004705.043_20260422_004705 Paper: self.20260422004705.043	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422004705.043 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 00:48	Success	-	View
exp_cr_10.3389_fmed.2026.1819087_20260422_004148 Paper: cr_10.3389_fmed.2026.1819087	Examiner stratification reveals clinically relevant variability in large language model answers to endodontic patient qu... Paper ID: cr_10.3389_fmed.2026.1819087 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recov...	04-22 00:42	Success	-	View
exp_self.20260422003907.042_20260422_003907 Paper: self.20260422003907.042	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422003907.042 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 00:40	Success	-	View
exp_self.20260422003216.041_20260422_003218 Paper: self.20260422003216.041	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422003216.041 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 00:33	Success	-	View
exp_pytrain.20260422002657.011_20260422_002700 Paper: pytrain.20260422002657.011	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-22 00:28	Success	-	View
exp_self.20260422002421.040_20260422_002422 Paper: self.20260422002421.040	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422002421.040 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 00:25	Success	-	View
exp_gh_NVIDIA_TransformerEngine_20260422_002107 Paper: gh_NVIDIA_TransformerEngine	NVIDIA/TransformerEngine Paper ID: gh_NVIDIA_TransformerEngine - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recove...	04-22 00:22	Success	-	View
exp_self.20260422001330.039_20260422_001331 Paper: self.20260422001330.039	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422001330.039 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 00:14	Success	-	View
exp_self.20260422000620.038_20260422_000621 Paper: self.20260422000620.038	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260422000620.038 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-22 00:07	Success	-	View
exp_hf_2604.19747_20260422_000224 Paper: hf_2604.19747	AnyRecon: Arbitrary-View 3D Reconstruction with Video Diffusion Model Paper ID: hf_2604.19747 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-22 00:03	Success	-	View
exp_self.20260421235647.037_20260421_235649 Paper: self.20260421235647.037	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421235647.037 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 23:57	Success	-	View
exp_pytrain.20260421235222.010_20260421_235223 Paper: pytrain.20260421235222.010	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 23:53	Success	-	View
exp_self.20260421234843.036_20260421_234845 Paper: self.20260421234843.036	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421234843.036 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 23:49	Success	-	View
exp_self.20260421234116.035_20260421_234116 Paper: self.20260421234116.035	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421234116.035 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 23:42	Success	-	View
exp_self.20260421233419.034_20260421_233420 Paper: self.20260421233419.034	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421233419.034 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 23:35	Success	-	View
exp_self.20260421232632.033_20260421_232633 Paper: self.20260421232632.033	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421232632.033 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 23:27	Success	-	View
exp_pytrain.20260421232058.009_20260421_232059 Paper: pytrain.20260421232058.009	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 23:22	Success	-	View
exp_self.20260421231848.032_20260421_231849 Paper: self.20260421231848.032	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421231848.032 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 23:19	Success	-	View
exp_self.20260421231040.031_20260421_231040 Paper: self.20260421231040.031	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421231040.031 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 23:11	Success	-	View
exp_self.20260421230233.030_20260421_230241 Paper: self.20260421230233.030	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421230233.030 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 23:03	Success	-	View
exp_self.20260421225413.029_20260421_225415 Paper: self.20260421225413.029	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421225413.029 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 22:55	Success	-	View
exp_pytrain.20260421224817.008_20260421_224817 Paper: pytrain.20260421224817.008	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 22:49	Success	-	View
exp_self.20260421224605.028_20260421_224606 Paper: self.20260421224605.028	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421224605.028 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 22:47	Success	-	View
exp_self.20260421223808.027_20260421_223808 Paper: self.20260421223808.027	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421223808.027 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 22:39	Success	-	View
exp_self.20260421223019.026_20260421_223020 Paper: self.20260421223019.026	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421223019.026 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 22:31	Success	-	View
exp_self.20260421222220.025_20260421_222220 Paper: self.20260421222220.025	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421222220.025 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 22:23	Success	-	View
exp_hf_2604.17397_20260421_221911 Paper: hf_2604.17397	Speculative Decoding for Autoregressive Video Generation Paper ID: hf_2604.17397 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-21 22:20	Success	-	View
exp_pytrain.20260421221652.007_20260421_221654 Paper: pytrain.20260421221652.007	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 22:17	Success	-	View
exp_self.20260421221404.024_20260421_221405 Paper: self.20260421221404.024	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421221404.024 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 22:15	Success	-	View
exp_self.20260421220628.023_20260421_220629 Paper: self.20260421220628.023	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421220628.023 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 22:07	Success	-	View
exp_hf_2604.15706_20260421_220044 Paper: hf_2604.15706	Target-Oriented Pretraining Data Selection via Neuron-Activated Graph Paper ID: hf_2604.15706 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-21 22:01	Success	-	View
exp_self.20260421215833.022_20260421_215834 Paper: self.20260421215833.022	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421215833.022 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 21:59	Success	-	View
exp_2604.19748v1_20260421_215343 Paper: 2604.19748v1	Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items Paper ID: 2604.19748v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-21 21:54	Success	-	View
exp_self.20260421215046.021_20260421_215047 Paper: self.20260421215046.021	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421215046.021 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 21:51	Success	-	View
exp_2604.19747v1_20260421_214717 Paper: 2604.19747v1	AnyRecon: Arbitrary-View 3D Reconstruction with Video Diffusion Model Paper ID: 2604.19747v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-21 21:48	Success	-	View
exp_pytrain.20260421214440.006_20260421_214441 Paper: pytrain.20260421214440.006	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 21:45	Success	-	View
exp_self.20260421214155.020_20260421_214155 Paper: self.20260421214155.020	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421214155.020 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 21:42	Success	-	View
exp_hf_2604.19636_20260421_213821 Paper: hf_2604.19636	CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation Paper ID: hf_2604.19636 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-21 21:39	Success	-	View
exp_self.20260421213314.019_20260421_213314 Paper: self.20260421213314.019	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421213314.019 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 21:34	Success	-	View
exp_self.20260421212522.018_20260421_212524 Paper: self.20260421212522.018	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421212522.018 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 21:26	Success	-	View
exp_hf_2604.19748_20260421_212050 Paper: hf_2604.19748	Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items Paper ID: hf_2604.19748 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-21 21:21	Success	-	View
exp_self.20260421211832.017_20260421_211833 Paper: self.20260421211832.017	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421211832.017 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 21:19	Success	-	View
exp_hf_2604.19550_20260421_211506 Paper: hf_2604.19550	LoopCTR: Unlocking the Loop Scaling Power for Click-Through Rate Prediction Paper ID: hf_2604.19550 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-21 21:16	Success	-	View
exp_pytrain.20260421211245.005_20260421_211247 Paper: pytrain.20260421211245.005	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 21:13	Success	-	View
exp_self.20260421211002.016_20260421_211003 Paper: self.20260421211002.016	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421211002.016 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 21:11	Success	-	View
exp_self.20260421210155.015_20260421_210158 Paper: self.20260421210155.015	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421210155.015 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 21:03	Success	-	View
exp_self.20260421205346.014_20260421_205347 Paper: self.20260421205346.014	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421205346.014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 20:54	Success	-	View
exp_2604.19473v1_20260421_204906 Paper: 2604.19473v1	TS-Attn: Temporal-wise Separable Attention for Multi-Event Video Generation Paper ID: 2604.19473v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-21 20:50	Success	-	View
exp_self.20260421204635.013_20260421_204635 Paper: self.20260421204635.013	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421204635.013 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 20:47	Success	-	View
exp_2604.19464v1_20260421_204319 Paper: 2604.19464v1	LePREC: Reasoning as Classification over Structured Factors for Assessing Relevance of Legal Issues Paper ID: 2604.19464v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-21 20:44	Success	-	View
exp_pytrain.20260421204041.004_20260421_204041 Paper: pytrain.20260421204041.004	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 20:41	Success	-	View
exp_self.20260421203801.012_20260421_203803 Paper: self.20260421203801.012	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421203801.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 20:39	Success	-	View
exp_self.20260421202857.011_20260421_202858 Paper: self.20260421202857.011	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421202857.011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 20:30	Success	-	View
exp_self.20260421202106.010_20260421_202106 Paper: self.20260421202106.010	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421202106.010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 20:22	Success	-	View
exp_self.20260421201203.009_20260421_201205 Paper: self.20260421201203.009	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421201203.009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 20:13	Success	-	View
exp_pytrain.20260421200910.003_20260421_200912 Paper: pytrain.20260421200910.003	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 20:10	Success	-	View
exp_self.20260421200155.008_20260421_200156 Paper: self.20260421200155.008	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421200155.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 20:02	Success	-	View
exp_self.20260421195422.007_20260421_195422 Paper: self.20260421195422.007	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421195422.007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 19:55	Success	-	View
exp_self.20260421194716.006_20260421_194718 Paper: self.20260421194716.006	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421194716.006 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 19:48	Success	-	View
exp_self.20260421193916.005_20260421_193917 Paper: self.20260421193916.005	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421193916.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 19:40	Success	-	View
exp_pytrain.20260421193556.002_20260421_193559 Paper: pytrain.20260421193556.002	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 19:37	Success	-	View
exp_self.20260421192933.004_20260421_192933 Paper: self.20260421192933.004	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421192933.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 19:30	Success	-	View
exp_self.20260421192239.003_20260421_192240 Paper: self.20260421192239.003	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421192239.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 19:23	Success	-	View
exp_self.20260421191504.002_20260421_191507 Paper: self.20260421191504.002	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421191504.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 19:16	Success	-	View
exp_self.20260421190652.001_20260421_190652 Paper: self.20260421190652.001	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421190652.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 19:07	Success	-	View
exp_pytrain.20260421190343.001_20260421_190347 Paper: pytrain.20260421190343.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 19:04	Success	-	View
exp_self.20260421182628.001_20260421_182630 Paper: self.20260421182628.001	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421182628.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 18:27	Success	-	View
exp_pytrain.20260421182329.001_20260421_182332 Paper: pytrain.20260421182329.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 18:24	Success	-	View
exp_self.20260421181542.194_20260421_181544 Paper: self.20260421181542.194	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421181542.194 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 18:16	Success	-	View
exp_self.20260421180805.193_20260421_180810 Paper: self.20260421180805.193	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421180805.193 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 18:09	Success	-	View
exp_self.20260421180005.192_20260421_180009 Paper: self.20260421180005.192	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421180005.192 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 18:01	Success	-	View
exp_pytrain.20260421175600.049_20260421_175602 Paper: pytrain.20260421175600.049	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 17:57	Success	-	View
exp_self.20260421175129.191_20260421_175129 Paper: self.20260421175129.191	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421175129.191 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 17:52	Success	-	View
exp_self.20260421174330.190_20260421_174331 Paper: self.20260421174330.190	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421174330.190 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 17:44	Success	-	View
exp_self.20260421173637.189_20260421_173637 Paper: self.20260421173637.189	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421173637.189 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 17:37	Success	-	View
exp_self.20260421172939.188_20260421_172949 Paper: self.20260421172939.188	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421172939.188 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 17:30	Success	-	View
exp_pytrain.20260421172439.048_20260421_172440 Paper: pytrain.20260421172439.048	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 17:25	Success	-	View
exp_self.20260421172230.187_20260421_172238 Paper: self.20260421172230.187	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421172230.187 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 17:23	Success	-	View
exp_self.20260421171334.186_20260421_171337 Paper: self.20260421171334.186	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421171334.186 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 17:14	Success	-	View
exp_self.20260421170536.185_20260421_170537 Paper: self.20260421170536.185	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421170536.185 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 17:06	Success	-	View
exp_self.20260421165713.184_20260421_165717 Paper: self.20260421165713.184	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421165713.184 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 16:58	Success	-	View
exp_pytrain.20260421165115.047_20260421_165115 Paper: pytrain.20260421165115.047	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 16:52	Success	-	View
exp_self.20260421164857.183_20260421_164858 Paper: self.20260421164857.183	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421164857.183 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 16:50	Success	-	View
exp_self.20260421164218.182_20260421_164219 Paper: self.20260421164218.182	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421164218.182 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 16:43	Success	-	View
exp_self.20260421163425.181_20260421_163427 Paper: self.20260421163425.181	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421163425.181 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 16:35	Success	-	View
exp_self.20260421162533.180_20260421_162535 Paper: self.20260421162533.180	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421162533.180 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 16:26	Success	-	View
exp_pytrain.20260421161954.046_20260421_161955 Paper: pytrain.20260421161954.046	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 16:20	Success	-	View
exp_self.20260421161737.179_20260421_161738 Paper: self.20260421161737.179	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421161737.179 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 16:18	Success	-	View
exp_self.20260421161016.178_20260421_161018 Paper: self.20260421161016.178	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421161016.178 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 16:11	Success	-	View
exp_self.20260421160220.177_20260421_160223 Paper: self.20260421160220.177	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421160220.177 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 16:03	Success	-	View
exp_cr_10.2196_89540_20260421_155658 Paper: cr_10.2196_89540	Classifying American Society of Anesthesiologists Physical Status With a Low-Rank–Adapted Large Language Model: Developm... Paper ID: cr_10.2196_89540 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchma...	04-21 15:58	Success	-	View
exp_self.20260421155152.176_20260421_155154 Paper: self.20260421155152.176	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421155152.176 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 15:52	Success	-	View
exp_pytrain.20260421154821.045_20260421_154821 Paper: pytrain.20260421154821.045	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 15:49	Success	-	View
exp_self.20260421154216.175_20260421_154218 Paper: self.20260421154216.175	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421154216.175 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 15:43	Success	-	View
exp_hf_2604.18396_20260421_153805 Paper: hf_2604.18396	River-LLM: Large Language Model Seamless Exit Based on KV Share Paper ID: hf_2604.18396 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-21 15:39	Success	-	View
exp_self.20260421153413.174_20260421_153415 Paper: self.20260421153413.174	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421153413.174 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 15:35	Success	-	View
exp_self.20260421152648.173_20260421_152648 Paper: self.20260421152648.173	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421152648.173 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 15:27	Success	-	View
exp_self.20260421151922.172_20260421_151923 Paper: self.20260421151922.172	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421151922.172 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 15:20	Success	-	View
exp_pytrain.20260421151557.044_20260421_151558 Paper: pytrain.20260421151557.044	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 15:17	Success	-	View
exp_self.20260421151144.171_20260421_151146 Paper: self.20260421151144.171	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421151144.171 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 15:12	Success	-	View
exp_self.20260421145414.170_20260421_145414 Paper: self.20260421145414.170	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421145414.170 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 14:55	Success	-	View
exp_hf_2604.18267_20260421_144953 Paper: hf_2604.18267	MARCO: Navigating the Unseen Space of Semantic Correspondence Paper ID: hf_2604.18267 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-21 14:50	Success	-	View
exp_self.20260421144726.169_20260421_144730 Paper: self.20260421144726.169	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421144726.169 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 14:48	Success	-	View
exp_pytrain.20260421144404.043_20260421_144406 Paper: pytrain.20260421144404.043	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 14:45	Success	-	View
exp_self.20260421143728.168_20260421_143730 Paper: self.20260421143728.168	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421143728.168 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 14:38	Success	-	View
exp_self.20260421143009.167_20260421_143011 Paper: self.20260421143009.167	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421143009.167 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 14:31	Success	-	View
exp_self.20260421142233.166_20260421_142235 Paper: self.20260421142233.166	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421142233.166 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 14:23	Success	-	View
exp_self.20260421141407.165_20260421_141407 Paper: self.20260421141407.165	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421141407.165 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 14:15	Success	-	View
exp_pytrain.20260421141150.042_20260421_141150 Paper: pytrain.20260421141150.042	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 14:12	Success	-	View
exp_self.20260421140451.164_20260421_140451 Paper: self.20260421140451.164	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421140451.164 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 14:05	Success	-	View
exp_self.20260421135734.163_20260421_135734 Paper: self.20260421135734.163	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421135734.163 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 13:58	Success	-	View
exp_self.20260421135020.162_20260421_135020 Paper: self.20260421135020.162	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421135020.162 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 13:51	Success	-	View
exp_self.20260421134257.161_20260421_134258 Paper: self.20260421134257.161	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421134257.161 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 13:44	Success	-	View
exp_pytrain.20260421134032.041_20260421_134032 Paper: pytrain.20260421134032.041	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 13:41	Success	-	View
exp_self.20260421133446.160_20260421_133446 Paper: self.20260421133446.160	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421133446.160 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 13:35	Success	-	View
exp_self.20260421132702.159_20260421_132702 Paper: self.20260421132702.159	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421132702.159 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 13:28	Success	-	View
exp_self.20260421131855.158_20260421_131855 Paper: self.20260421131855.158	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421131855.158 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 13:19	Success	-	View
exp_self.20260421131135.157_20260421_131135 Paper: self.20260421131135.157	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421131135.157 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 13:12	Success	-	View
exp_pytrain.20260421130917.040_20260421_130918 Paper: pytrain.20260421130917.040	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 13:10	Success	-	View
exp_self.20260421130431.156_20260421_130432 Paper: self.20260421130431.156	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421130431.156 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 13:05	Success	-	View
exp_self.20260421125626.155_20260421_125626 Paper: self.20260421125626.155	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421125626.155 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 12:57	Success	-	View
exp_self.20260421124821.154_20260421_124822 Paper: self.20260421124821.154	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421124821.154 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 12:49	Success	-	View
exp_self.20260421124017.153_20260421_124017 Paper: self.20260421124017.153	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421124017.153 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 12:41	Success	-	View
exp_pytrain.20260421123723.039_20260421_123723 Paper: pytrain.20260421123723.039	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 12:38	Success	-	View
exp_self.20260421123104.152_20260421_123104 Paper: self.20260421123104.152	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421123104.152 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 12:32	Success	-	View
exp_self.20260421122300.151_20260421_122300 Paper: self.20260421122300.151	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421122300.151 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 12:24	Success	-	View
exp_self.20260421121456.150_20260421_121456 Paper: self.20260421121456.150	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421121456.150 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 12:16	Success	-	View
exp_self.20260421120803.149_20260421_120803 Paper: self.20260421120803.149	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421120803.149 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 12:09	Success	-	View
exp_pytrain.20260421120510.038_20260421_120510 Paper: pytrain.20260421120510.038	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 12:06	Success	-	View
exp_self.20260421115734.148_20260421_115734 Paper: self.20260421115734.148	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421115734.148 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 11:58	Success	-	View
exp_self.20260421114931.147_20260421_114932 Paper: self.20260421114931.147	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421114931.147 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 11:50	Success	-	View
exp_self.20260421114235.146_20260421_114236 Paper: self.20260421114235.146	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421114235.146 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 11:43	Success	-	View
exp_self.20260421113535.145_20260421_113536 Paper: self.20260421113535.145	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421113535.145 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 11:36	Success	-	View
exp_pytrain.20260421113312.037_20260421_113312 Paper: pytrain.20260421113312.037	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 11:34	Success	-	View
exp_self.20260421112654.144_20260421_112654 Paper: self.20260421112654.144	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421112654.144 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 11:27	Success	-	View
exp_self.20260421111907.143_20260421_111907 Paper: self.20260421111907.143	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421111907.143 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 11:20	Success	-	View
exp_self.20260421111126.142_20260421_111126 Paper: self.20260421111126.142	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421111126.142 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 11:12	Success	-	View
exp_self.20260421110406.141_20260421_110406 Paper: self.20260421110406.141	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421110406.141 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 11:05	Success	-	View
exp_pytrain.20260421110142.036_20260421_110142 Paper: pytrain.20260421110142.036	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 11:02	Success	-	View
exp_hf_2604.16498_20260421_105902 Paper: hf_2604.16498	Forge-UGC: FX optimization and register-graph engine for universal graph compiler Paper ID: hf_2604.16498 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-21 11:00	Success	-	View
exp_self.20260421105428.140_20260421_105429 Paper: self.20260421105428.140	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421105428.140 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 10:55	Success	-	View
exp_self.20260421104703.139_20260421_104703 Paper: self.20260421104703.139	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421104703.139 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 10:48	Success	-	View
exp_hf_2604.16830_20260421_104327 Paper: hf_2604.16830	The Illusion of Certainty: Decoupling Capability and Calibration in On-Policy Distillation Paper ID: hf_2604.16830 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-21 10:44	Success	-	View
exp_self.20260421103848.138_20260421_103849 Paper: self.20260421103848.138	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421103848.138 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 10:39	Success	-	View
exp_self.20260421103122.137_20260421_103122 Paper: self.20260421103122.137	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421103122.137 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 10:32	Success	-	View
exp_pytrain.20260421102858.035_20260421_102858 Paper: pytrain.20260421102858.035	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 10:30	Success	-	View
exp_hf_2602.15143_20260421_102618 Paper: hf_2602.15143	Protecting Language Models Against Unauthorized Distillation through Trace Rewriting Paper ID: hf_2602.15143 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-21 10:27	Success	-	View
exp_self.20260421102205.136_20260421_102205 Paper: self.20260421102205.136	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421102205.136 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 10:23	Success	-	View
exp_cr_10.1038_s41598-026-48666-1_20260421_101857 Paper: cr_10.1038_s41598-026-48666-1	Multimodal survival analysis of glioblastoma using whole-slide histopathology, gene expression, clinical variables and l... Paper ID: cr_10.1038_s41598-026-48666-1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Reco...	04-21 10:19	Success	-	View
exp_self.20260421101334.135_20260421_101334 Paper: self.20260421101334.135	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421101334.135 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 10:14	Success	-	View
exp_self.20260421100611.134_20260421_100612 Paper: self.20260421100611.134	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421100611.134 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 10:07	Success	-	View
exp_self.20260421095839.133_20260421_095839 Paper: self.20260421095839.133	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421095839.133 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 09:59	Success	-	View
exp_pytrain.20260421095618.034_20260421_095619 Paper: pytrain.20260421095618.034	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 09:57	Success	-	View
exp_self.20260421095135.132_20260421_095135 Paper: self.20260421095135.132	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421095135.132 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 09:52	Success	-	View
exp_self.20260421094351.131_20260421_094352 Paper: self.20260421094351.131	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421094351.131 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 09:44	Success	-	View
exp_hf_2511.10262_20260421_093956 Paper: hf_2511.10262	MTR-DuplexBench: Towards a Comprehensive Evaluation of Multi-Round Conversations for Full-Duplex Speech Language Models Paper ID: hf_2511.10262 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-21 09:41	Success	-	View
exp_self.20260421093442.130_20260421_093442 Paper: self.20260421093442.130	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421093442.130 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 09:35	Success	-	View
exp_hf_2604.15710_20260421_093126 Paper: hf_2604.15710	VoxMind: An End-to-End Agentic Spoken Dialogue System Paper ID: hf_2604.15710 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-21 09:32	Success	-	View
exp_self.20260421092701.129_20260421_092701 Paper: self.20260421092701.129	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421092701.129 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 09:28	Success	-	View
exp_pytrain.20260421092444.033_20260421_092444 Paper: pytrain.20260421092444.033	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 09:25	Success	-	View
exp_self.20260421091745.128_20260421_091745 Paper: self.20260421091745.128	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421091745.128 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 09:18	Success	-	View
exp_self.20260421091025.127_20260421_091025 Paper: self.20260421091025.127	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421091025.127 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 09:11	Success	-	View
exp_hf_2604.16576_20260421_090446 Paper: hf_2604.16576	On the Robustness of LLM-Based Dense Retrievers: A Systematic Analysis of Generalizability and Stability Paper ID: hf_2604.16576 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-21 09:05	Success	-	View
exp_self.20260421090250.126_20260421_090251 Paper: self.20260421090250.126	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421090250.126 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 09:03	Success	-	View
exp_self.20260421085450.125_20260421_085450 Paper: self.20260421085450.125	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421085450.125 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 08:55	Success	-	View
exp_pytrain.20260421085149.032_20260421_085149 Paper: pytrain.20260421085149.032	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 08:52	Success	-	View
exp_hf_2604.17091_20260421_084646 Paper: hf_2604.17091	GenericAgent: A Token-Efficient Self-Evolving LLM Agent via Contextual Information Density Maximization (V1.0) Paper ID: hf_2604.17091 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-21 08:47	Success	-	View
exp_self.20260421084450.124_20260421_084451 Paper: self.20260421084450.124	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421084450.124 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 08:45	Success	-	View
exp_self.20260421083730.123_20260421_083731 Paper: self.20260421083730.123	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421083730.123 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 08:38	Success	-	View
exp_self.20260421083001.122_20260421_083001 Paper: self.20260421083001.122	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421083001.122 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 08:31	Success	-	View
exp_self.20260421082317.121_20260421_082318 Paper: self.20260421082317.121	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421082317.121 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 08:24	Success	-	View
exp_pytrain.20260421082020.031_20260421_082021 Paper: pytrain.20260421082020.031	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 08:21	Success	-	View
exp_self.20260421081324.120_20260421_081324 Paper: self.20260421081324.120	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421081324.120 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 08:14	Success	-	View
exp_self.20260421080553.119_20260421_080553 Paper: self.20260421080553.119	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421080553.119 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 08:06	Success	-	View
exp_self.20260421075818.118_20260421_075818 Paper: self.20260421075818.118	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421075818.118 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 07:59	Success	-	View
exp_self.20260421075045.117_20260421_075045 Paper: self.20260421075045.117	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421075045.117 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 07:51	Success	-	View
exp_pytrain.20260421074819.030_20260421_074820 Paper: pytrain.20260421074819.030	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 07:49	Success	-	View
exp_self.20260421074125.116_20260421_074126 Paper: self.20260421074125.116	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421074125.116 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 07:42	Success	-	View
exp_self.20260421073352.115_20260421_073352 Paper: self.20260421073352.115	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421073352.115 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 07:34	Success	-	View
exp_self.20260421072621.114_20260421_072622 Paper: self.20260421072621.114	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421072621.114 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 07:27	Success	-	View
exp_self.20260421071851.113_20260421_071852 Paper: self.20260421071851.113	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421071851.113 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 07:19	Success	-	View
exp_pytrain.20260421071627.029_20260421_071627 Paper: pytrain.20260421071627.029	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 07:17	Success	-	View
exp_self.20260421070937.112_20260421_070938 Paper: self.20260421070937.112	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421070937.112 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 07:10	Success	-	View
exp_self.20260421070209.111_20260421_070210 Paper: self.20260421070209.111	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421070209.111 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 07:03	Success	-	View
exp_self.20260421065443.110_20260421_065444 Paper: self.20260421065443.110	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421065443.110 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 06:55	Success	-	View
exp_self.20260421064718.109_20260421_064718 Paper: self.20260421064718.109	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421064718.109 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 06:48	Success	-	View
exp_pytrain.20260421064459.028_20260421_064459 Paper: pytrain.20260421064459.028	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 06:46	Success	-	View
exp_cr_10.55041_isjem06670_20260421_063959 Paper: cr_10.55041_isjem06670	A Review of Quantization Techniques for Large Language Models: From Post-Training Quantization to Extreme 1-Bit Methods Paper ID: cr_10.55041_isjem06670 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered b...	04-21 06:41	Success	-	View
exp_self.20260421063758.108_20260421_063758 Paper: self.20260421063758.108	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421063758.108 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 06:39	Success	-	View
exp_self.20260421063037.107_20260421_063037 Paper: self.20260421063037.107	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421063037.107 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 06:31	Success	-	View
exp_self.20260421062310.106_20260421_062311 Paper: self.20260421062310.106	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421062310.106 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 06:24	Success	-	View
exp_self.20260421061537.105_20260421_061537 Paper: self.20260421061537.105	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421061537.105 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 06:16	Success	-	View
exp_pytrain.20260421061305.027_20260421_061305 Paper: pytrain.20260421061305.027	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 06:14	Success	-	View
exp_self.20260421060611.104_20260421_060612 Paper: self.20260421060611.104	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421060611.104 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 06:07	Success	-	View
exp_self.20260421055840.103_20260421_055840 Paper: self.20260421055840.103	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421055840.103 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 05:59	Success	-	View
exp_self.20260421055105.102_20260421_055106 Paper: self.20260421055105.102	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421055105.102 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 05:52	Success	-	View
exp_self.20260421054335.101_20260421_054335 Paper: self.20260421054335.101	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421054335.101 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 05:44	Success	-	View
exp_pytrain.20260421054107.026_20260421_054107 Paper: pytrain.20260421054107.026	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 05:42	Success	-	View
exp_self.20260421053414.100_20260421_053414 Paper: self.20260421053414.100	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421053414.100 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 05:35	Success	-	View
exp_self.20260421052644.099_20260421_052644 Paper: self.20260421052644.099	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421052644.099 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 05:27	Success	-	View
exp_gh_bojobh609_TurboQuant_20260421_052115 Paper: gh_bojobh609_TurboQuant	bojobh609/TurboQuant Paper ID: gh_bojobh609_TurboQuant - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 05:22	Success	-	View
exp_self.20260421051914.098_20260421_051915 Paper: self.20260421051914.098	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421051914.098 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 05:20	Success	-	View
exp_self.20260421051145.097_20260421_051146 Paper: self.20260421051145.097	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421051145.097 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 05:12	Success	-	View
exp_pytrain.20260421050924.025_20260421_050924 Paper: pytrain.20260421050924.025	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 05:10	Success	-	View
exp_self.20260421050228.096_20260421_050228 Paper: self.20260421050228.096	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421050228.096 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 05:03	Success	-	View
exp_self.20260421045505.095_20260421_045506 Paper: self.20260421045505.095	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421045505.095 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 04:56	Success	-	View
exp_self.20260421044735.094_20260421_044736 Paper: self.20260421044735.094	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421044735.094 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 04:48	Success	-	View
exp_self.20260421044005.093_20260421_044005 Paper: self.20260421044005.093	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421044005.093 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 04:41	Success	-	View
exp_pytrain.20260421043742.024_20260421_043743 Paper: pytrain.20260421043742.024	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 04:38	Success	-	View
exp_self.20260421043047.092_20260421_043048 Paper: self.20260421043047.092	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421043047.092 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 04:31	Success	-	View
exp_self.20260421042324.091_20260421_042325 Paper: self.20260421042324.091	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421042324.091 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 04:24	Success	-	View
exp_self.20260421041604.090_20260421_041604 Paper: self.20260421041604.090	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421041604.090 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 04:17	Success	-	View
exp_self.20260421040833.089_20260421_040834 Paper: self.20260421040833.089	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421040833.089 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 04:09	Success	-	View
exp_pytrain.20260421040609.023_20260421_040609 Paper: pytrain.20260421040609.023	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 04:07	Success	-	View
exp_self.20260421035913.088_20260421_035913 Paper: self.20260421035913.088	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421035913.088 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 04:00	Success	-	View
exp_self.20260421035151.087_20260421_035152 Paper: self.20260421035151.087	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421035151.087 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 03:52	Success	-	View
exp_self.20260421034430.086_20260421_034430 Paper: self.20260421034430.086	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421034430.086 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 03:45	Success	-	View
exp_self.20260421033702.085_20260421_033702 Paper: self.20260421033702.085	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421033702.085 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 03:38	Success	-	View
exp_pytrain.20260421033429.022_20260421_033430 Paper: pytrain.20260421033429.022	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 03:35	Success	-	View
exp_self.20260421032737.084_20260421_032738 Paper: self.20260421032737.084	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421032737.084 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 03:28	Success	-	View
exp_self.20260421032009.083_20260421_032009 Paper: self.20260421032009.083	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421032009.083 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 03:21	Success	-	View
exp_self.20260421031242.082_20260421_031242 Paper: self.20260421031242.082	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421031242.082 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 03:13	Success	-	View
exp_self.20260421030517.081_20260421_030517 Paper: self.20260421030517.081	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421030517.081 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 03:06	Success	-	View
exp_pytrain.20260421030250.021_20260421_030251 Paper: pytrain.20260421030250.021	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 03:03	Success	-	View
exp_self.20260421025558.080_20260421_025559 Paper: self.20260421025558.080	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421025558.080 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 02:57	Success	-	View
exp_gh_berntpopp_phentrieve_20260421_025139 Paper: gh_berntpopp_phentrieve	berntpopp/phentrieve Paper ID: gh_berntpopp_phentrieve - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 02:52	Success	-	View
exp_self.20260421024831.079_20260421_024831 Paper: self.20260421024831.079	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421024831.079 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 02:49	Success	-	View
exp_self.20260421024103.078_20260421_024104 Paper: self.20260421024103.078	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421024103.078 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 02:42	Success	-	View
exp_self.20260421023330.077_20260421_023331 Paper: self.20260421023330.077	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421023330.077 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 02:34	Success	-	View
exp_pytrain.20260421023111.020_20260421_023112 Paper: pytrain.20260421023111.020	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 02:32	Success	-	View
exp_self.20260421022415.076_20260421_022415 Paper: self.20260421022415.076	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421022415.076 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 02:25	Success	-	View
exp_self.20260421021653.075_20260421_021653 Paper: self.20260421021653.075	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421021653.075 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 02:17	Success	-	View
exp_self.20260421020914.074_20260421_020915 Paper: self.20260421020914.074	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421020914.074 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 02:10	Success	-	View
exp_self.20260421020144.073_20260421_020144 Paper: self.20260421020144.073	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421020144.073 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 02:02	Success	-	View
exp_pytrain.20260421015924.019_20260421_015925 Paper: pytrain.20260421015924.019	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 02:00	Success	-	View
exp_self.20260421015226.072_20260421_015226 Paper: self.20260421015226.072	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421015226.072 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 01:53	Success	-	View
exp_self.20260421014502.071_20260421_014503 Paper: self.20260421014502.071	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421014502.071 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 01:46	Success	-	View
exp_self.20260421013738.070_20260421_013739 Paper: self.20260421013738.070	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421013738.070 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 01:38	Success	-	View
exp_self.20260421013008.069_20260421_013009 Paper: self.20260421013008.069	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421013008.069 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 01:31	Success	-	View
exp_pytrain.20260421012746.018_20260421_012746 Paper: pytrain.20260421012746.018	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 01:28	Success	-	View
exp_self.20260421012052.068_20260421_012052 Paper: self.20260421012052.068	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421012052.068 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 01:21	Success	-	View
exp_self.20260421011330.067_20260421_011331 Paper: self.20260421011330.067	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421011330.067 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 01:14	Success	-	View
exp_self.20260421010610.066_20260421_010611 Paper: self.20260421010610.066	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421010610.066 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 01:07	Success	-	View
exp_self.20260421005846.065_20260421_005847 Paper: self.20260421005846.065	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421005846.065 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 00:59	Success	-	View
exp_pytrain.20260421005620.017_20260421_005620 Paper: pytrain.20260421005620.017	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 00:57	Success	-	View
exp_self.20260421004924.064_20260421_004925 Paper: self.20260421004924.064	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421004924.064 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 00:50	Success	-	View
exp_self.20260421004156.063_20260421_004156 Paper: self.20260421004156.063	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421004156.063 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 00:42	Success	-	View
exp_self.20260421003431.062_20260421_003431 Paper: self.20260421003431.062	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421003431.062 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 00:35	Success	-	View
exp_self.20260421002708.061_20260421_002708 Paper: self.20260421002708.061	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421002708.061 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 00:28	Success	-	View
exp_pytrain.20260421002439.016_20260421_002440 Paper: pytrain.20260421002439.016	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-21 00:25	Success	-	View
exp_self.20260421001749.060_20260421_001749 Paper: self.20260421001749.060	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421001749.060 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 00:18	Success	-	View
exp_self.20260421001018.059_20260421_001019 Paper: self.20260421001018.059	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421001018.059 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 00:11	Success	-	View
exp_hf_2604.17388_20260421_000513 Paper: hf_2604.17388	Back to Repair: A Minimal Denoising Network\ for Time Series Anomaly Detection Paper ID: hf_2604.17388 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-21 00:06	Success	-	View
exp_self.20260421000318.058_20260421_000318 Paper: self.20260421000318.058	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260421000318.058 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-21 00:04	Success	-	View
exp_self.20260420235506.057_20260420_235506 Paper: self.20260420235506.057	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420235506.057 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 23:56	Success	-	View
exp_pytrain.20260420235247.015_20260420_235248 Paper: pytrain.20260420235247.015	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 23:53	Success	-	View
exp_gh_whispering3_scao_20260420_235007 Paper: gh_whispering3_scao	whispering3/scao Paper ID: gh_whispering3_scao - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benc...	04-20 23:51	Success	-	View
exp_self.20260420234658.056_20260420_234658 Paper: self.20260420234658.056	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420234658.056 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 23:48	Success	-	View
exp_hf_2604.17454_20260420_234409 Paper: hf_2604.17454	HSG: Hyperbolic Scene Graph Paper ID: hf_2604.17454 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-20 23:45	Success	-	View
exp_self.20260420233710.055_20260420_233711 Paper: self.20260420233710.055	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420233710.055 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 23:38	Success	-	View
exp_self.20260420232943.054_20260420_232943 Paper: self.20260420232943.054	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420232943.054 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 23:30	Success	-	View
exp_self.20260420232220.053_20260420_232220 Paper: self.20260420232220.053	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420232220.053 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 23:23	Success	-	View
exp_pytrain.20260420231953.014_20260420_231953 Paper: pytrain.20260420231953.014	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 23:20	Success	-	View
exp_self.20260420231302.052_20260420_231303 Paper: self.20260420231302.052	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420231302.052 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 23:14	Success	-	View
exp_self.20260420230537.051_20260420_230537 Paper: self.20260420230537.051	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420230537.051 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 23:06	Success	-	View
exp_2604.18584v1_20260420_230012 Paper: 2604.18584v1	MathNet: a Global Multimodal Benchmark for Mathematical Reasoning and Retrieval Paper ID: 2604.18584v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-20 23:01	Success	-	View
exp_self.20260420225812.050_20260420_225813 Paper: self.20260420225812.050	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420225812.050 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 22:59	Success	-	View
exp_2604.18580v1_20260420_225503 Paper: 2604.18580v1	Sessa: Selective State Space Attention Paper ID: 2604.18580v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-20 22:56	Success	-	View
exp_self.20260420225053.049_20260420_225054 Paper: self.20260420225053.049	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420225053.049 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 22:51	Success	-	View
exp_pytrain.20260420224826.013_20260420_224827 Paper: pytrain.20260420224826.013	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 22:49	Success	-	View
exp_hf_2604.18584_20260420_224546 Paper: hf_2604.18584	MathNet: a Global Multimodal Benchmark for Mathematical Reasoning and Retrieval Paper ID: hf_2604.18584 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-20 22:46	Success	-	View
exp_self.20260420224024.048_20260420_224025 Paper: self.20260420224024.048	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420224024.048 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 22:41	Success	-	View
exp_self.20260420223254.047_20260420_223254 Paper: self.20260420223254.047	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420223254.047 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 22:33	Success	-	View
exp_self.20260420222534.046_20260420_222535 Paper: self.20260420222534.046	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420222534.046 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 22:26	Success	-	View
exp_self.20260420221813.045_20260420_221813 Paper: self.20260420221813.045	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420221813.045 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 22:19	Success	-	View
exp_pytrain.20260420221547.012_20260420_221547 Paper: pytrain.20260420221547.012	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 22:16	Success	-	View
exp_hf_2604.08537_20260420_221333 Paper: hf_2604.08537	Meta-learning In-Context Enables Training-Free Cross Subject Brain Decoding Paper ID: hf_2604.08537 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-20 22:14	Success	-	View
exp_self.20260420221029.044_20260420_221029 Paper: self.20260420221029.044	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420221029.044 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 22:11	Success	-	View
exp_hf_2604.18486_20260420_220742 Paper: hf_2604.18486	OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation Paper ID: hf_2604.18486 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-20 22:08	Success	-	View
exp_self.20260420220151.043_20260420_220151 Paper: self.20260420220151.043	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420220151.043 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 22:02	Success	-	View
exp_2604.18064v1_20260420_215624 Paper: 2604.18064v1	Understanding Human Actions through the Lens of Executable Models Paper ID: 2604.18064v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-20 21:57	Success	-	View
exp_self.20260420215424.042_20260420_215425 Paper: self.20260420215424.042	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420215424.042 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 21:55	Success	-	View
exp_2604.18067v1_20260420_215109 Paper: 2604.18067v1	Towards Real-Time ECG and EMG Modeling on $μ$ NPUs Paper ID: 2604.18067v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-20 21:52	Success	-	View
exp_self.20260420214658.041_20260420_214658 Paper: self.20260420214658.041	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420214658.041 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 21:48	Success	-	View
exp_pytrain.20260420214431.011_20260420_214431 Paper: pytrain.20260420214431.011	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 21:45	Success	-	View
exp_self.20260420213746.040_20260420_213746 Paper: self.20260420213746.040	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420213746.040 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 21:38	Success	-	View
exp_self.20260420213022.039_20260420_213022 Paper: self.20260420213022.039	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420213022.039 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 21:31	Success	-	View
exp_self.20260420212259.038_20260420_212259 Paper: self.20260420212259.038	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420212259.038 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 21:24	Success	-	View
exp_self.20260420211539.037_20260420_211539 Paper: self.20260420211539.037	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420211539.037 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 21:16	Success	-	View
exp_pytrain.20260420211315.010_20260420_211316 Paper: pytrain.20260420211315.010	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 21:14	Success	-	View
exp_self.20260420210632.036_20260420_210633 Paper: self.20260420210632.036	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420210632.036 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 21:07	Success	-	View
exp_hf_2604.17696_20260420_210059 Paper: hf_2604.17696	Stratagem: Learning Transferable Reasoning via Trajectory-Modulated Game Self-Play Paper ID: hf_2604.17696 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-20 21:02	Success	-	View
exp_self.20260420205905.035_20260420_205906 Paper: self.20260420205905.035	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420205905.035 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 21:00	Success	-	View
exp_self.20260420205144.034_20260420_205145 Paper: self.20260420205144.034	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420205144.034 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 20:52	Success	-	View
exp_hf_2604.17698_20260420_204827 Paper: hf_2604.17698	The Geometric Canary: Predicting Steerability and Detecting Drift via Representational Stability Paper ID: hf_2604.17698 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-20 20:49	Success	-	View
exp_self.20260420204420.033_20260420_204420 Paper: self.20260420204420.033	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420204420.033 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 20:45	Success	-	View
exp_pytrain.20260420204151.009_20260420_204151 Paper: pytrain.20260420204151.009	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 20:42	Success	-	View
exp_hf_2604.16642_20260420_203910 Paper: hf_2604.16642	Geometric coherence of single-cell CRISPR perturbations reveals regulatory architecture and predicts cellular stress Paper ID: hf_2604.16642 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-20 20:40	Success	-	View
exp_self.20260420203459.032_20260420_203459 Paper: self.20260420203459.032	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420203459.032 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 20:36	Success	-	View
exp_self.20260420202740.031_20260420_202740 Paper: self.20260420202740.031	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420202740.031 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 20:28	Success	-	View
exp_hf_2604.16503_20260420_202425 Paper: hf_2604.16503	Motif-Video 2B: Technical Report Paper ID: hf_2604.16503 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-20 20:25	Success	-	View
exp_self.20260420202013.030_20260420_202013 Paper: self.20260420202013.030	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420202013.030 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 20:21	Success	-	View
exp_self.20260420201253.029_20260420_201253 Paper: self.20260420201253.029	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420201253.029 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 20:13	Success	-	View
exp_pytrain.20260420201028.008_20260420_201029 Paper: pytrain.20260420201028.008	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 20:11	Success	-	View
exp_self.20260420200343.028_20260420_200344 Paper: self.20260420200343.028	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420200343.028 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 20:04	Success	-	View
exp_self.20260420195621.027_20260420_195621 Paper: self.20260420195621.027	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420195621.027 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 19:57	Success	-	View
exp_self.20260420194900.026_20260420_194901 Paper: self.20260420194900.026	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420194900.026 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 19:50	Success	-	View
exp_self.20260420194107.025_20260420_194108 Paper: self.20260420194107.025	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420194107.025 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 19:42	Success	-	View
exp_pytrain.20260420193810.007_20260420_193810 Paper: pytrain.20260420193810.007	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 19:39	Success	-	View
exp_gh_Labyrinthine-saltiness744_turboquant-mlx_20260420_193306 Paper: gh_Labyrinthine-saltiness744_turboquant-mlx	Labyrinthine-saltiness744/turboquant-mlx Paper ID: gh_Labyrinthine-saltiness744_turboquant-mlx - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expecte...	04-20 19:34	Success	-	View
exp_self.20260420193058.024_20260420_193058 Paper: self.20260420193058.024	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420193058.024 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 19:32	Success	-	View
exp_self.20260420192330.023_20260420_192331 Paper: self.20260420192330.023	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420192330.023 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 19:24	Success	-	View
exp_self.20260420191606.022_20260420_191606 Paper: self.20260420191606.022	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420191606.022 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 19:17	Success	-	View
exp_self.20260420190834.021_20260420_190835 Paper: self.20260420190834.021	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420190834.021 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 19:09	Success	-	View
exp_pytrain.20260420190606.006_20260420_190606 Paper: pytrain.20260420190606.006	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 19:07	Success	-	View
exp_self.20260420185907.020_20260420_185907 Paper: self.20260420185907.020	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420185907.020 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 19:00	Success	-	View
exp_self.20260420185144.019_20260420_185144 Paper: self.20260420185144.019	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420185144.019 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 18:52	Success	-	View
exp_self.20260420184419.018_20260420_184420 Paper: self.20260420184419.018	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420184419.018 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 18:45	Success	-	View
exp_self.20260420183654.017_20260420_183654 Paper: self.20260420183654.017	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420183654.017 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 18:37	Success	-	View
exp_pytrain.20260420183422.005_20260420_183423 Paper: pytrain.20260420183422.005	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 18:35	Success	-	View
exp_self.20260420182725.016_20260420_182726 Paper: self.20260420182725.016	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420182725.016 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 18:28	Success	-	View
exp_self.20260420181951.015_20260420_181952 Paper: self.20260420181951.015	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420181951.015 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 18:20	Success	-	View
exp_self.20260420181224.014_20260420_181225 Paper: self.20260420181224.014	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420181224.014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 18:13	Success	-	View
exp_self.20260420180501.013_20260420_180502 Paper: self.20260420180501.013	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420180501.013 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 18:06	Success	-	View
exp_pytrain.20260420180227.004_20260420_180227 Paper: pytrain.20260420180227.004	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 18:03	Success	-	View
exp_self.20260420175534.012_20260420_175535 Paper: self.20260420175534.012	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420175534.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 17:56	Success	-	View
exp_self.20260420174805.011_20260420_174805 Paper: self.20260420174805.011	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420174805.011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 17:49	Success	-	View
exp_self.20260420174037.010_20260420_174038 Paper: self.20260420174037.010	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420174037.010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 17:41	Success	-	View
exp_self.20260420173308.009_20260420_173309 Paper: self.20260420173308.009	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420173308.009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 17:34	Success	-	View
exp_pytrain.20260420173035.003_20260420_173036 Paper: pytrain.20260420173035.003	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 17:31	Success	-	View
exp_self.20260420172346.008_20260420_172347 Paper: self.20260420172346.008	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420172346.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 17:24	Success	-	View
exp_self.20260420171612.007_20260420_171613 Paper: self.20260420171612.007	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420171612.007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 17:17	Success	-	View
exp_self.20260420170842.006_20260420_170842 Paper: self.20260420170842.006	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420170842.006 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 17:09	Success	-	View
exp_self.20260420170113.005_20260420_170114 Paper: self.20260420170113.005	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420170113.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 17:02	Success	-	View
exp_pytrain.20260420165843.002_20260420_165843 Paper: pytrain.20260420165843.002	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 16:59	Success	-	View
exp_self.20260420165149.004_20260420_165149 Paper: self.20260420165149.004	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420165149.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 16:52	Success	-	View
exp_self.20260420164419.003_20260420_164420 Paper: self.20260420164419.003	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420164419.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 16:45	Success	-	View
exp_self.20260420163650.002_20260420_163651 Paper: self.20260420163650.002	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420163650.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 16:37	Success	-	View
exp_self.20260420162923.001_20260420_162923 Paper: self.20260420162923.001	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420162923.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 16:30	Success	-	View
exp_pytrain.20260420162704.001_20260420_162704 Paper: pytrain.20260420162704.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 16:28	Success	-	View
exp_self.20260420144048.743_20260420_144049 Paper: self.20260420144048.743	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420144048.743 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 14:40	Pending	-	View
exp_self.20260420143322.742_20260420_143323 Paper: self.20260420143322.742	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420143322.742 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 14:34	Success	-	View
exp_self.20260420142550.741_20260420_142551 Paper: self.20260420142550.741	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420142550.741 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 14:26	Success	-	View
exp_self.20260420141820.740_20260420_141820 Paper: self.20260420141820.740	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420141820.740 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 14:19	Success	-	View
exp_pytrain.20260420141550.183_20260420_141550 Paper: pytrain.20260420141550.183	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 14:16	Success	-	View
exp_self.20260420140843.739_20260420_140844 Paper: self.20260420140843.739	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420140843.739 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 14:09	Success	-	View
exp_self.20260420140114.738_20260420_140115 Paper: self.20260420140114.738	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420140114.738 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 14:02	Success	-	View
exp_self.20260420135339.737_20260420_135339 Paper: self.20260420135339.737	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420135339.737 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 13:54	Success	-	View
exp_self.20260420134607.736_20260420_134607 Paper: self.20260420134607.736	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420134607.736 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 13:47	Success	-	View
exp_pytrain.20260420134337.182_20260420_134338 Paper: pytrain.20260420134337.182	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 13:44	Success	-	View
exp_self.20260420133630.735_20260420_133630 Paper: self.20260420133630.735	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420133630.735 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 13:37	Success	-	View
exp_self.20260420132859.734_20260420_132900 Paper: self.20260420132859.734	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420132859.734 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 13:30	Success	-	View
exp_self.20260420132130.733_20260420_132130 Paper: self.20260420132130.733	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420132130.733 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 13:22	Success	-	View
exp_self.20260420131355.732_20260420_131356 Paper: self.20260420131355.732	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420131355.732 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 13:14	Success	-	View
exp_pytrain.20260420131123.181_20260420_131123 Paper: pytrain.20260420131123.181	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 13:12	Success	-	View
exp_self.20260420130419.731_20260420_130419 Paper: self.20260420130419.731	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420130419.731 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 13:05	Success	-	View
exp_self.20260420125646.730_20260420_125646 Paper: self.20260420125646.730	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420125646.730 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 12:57	Success	-	View
exp_self.20260420124913.729_20260420_124913 Paper: self.20260420124913.729	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420124913.729 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 12:50	Success	-	View
exp_cr_10.1108_jbsed-05-2025-0135_20260420_124446 Paper: cr_10.1108_jbsed-05-2025-0135	Building smarter digital content: a CRITIC – DEMATEL framework for leveraging large language model optimization in marke... Paper ID: cr_10.1108_jbsed-05-2025-0135 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Reco...	04-20 12:45	Success	-	View
exp_self.20260420124127.728_20260420_124128 Paper: self.20260420124127.728	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420124127.728 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 12:42	Success	-	View
exp_pytrain.20260420123850.180_20260420_123851 Paper: pytrain.20260420123850.180	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 12:39	Success	-	View
exp_self.20260420123153.727_20260420_123153 Paper: self.20260420123153.727	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420123153.727 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 12:32	Success	-	View
exp_self.20260420122416.726_20260420_122416 Paper: self.20260420122416.726	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420122416.726 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 12:25	Success	-	View
exp_self.20260420121639.725_20260420_121639 Paper: self.20260420121639.725	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420121639.725 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 12:17	Success	-	View
exp_self.20260420120906.724_20260420_120906 Paper: self.20260420120906.724	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420120906.724 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 12:10	Success	-	View
exp_pytrain.20260420120624.179_20260420_120624 Paper: pytrain.20260420120624.179	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 12:07	Success	-	View
exp_self.20260420115911.723_20260420_115912 Paper: self.20260420115911.723	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420115911.723 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 12:00	Success	-	View
exp_self.20260420115136.722_20260420_115136 Paper: self.20260420115136.722	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420115136.722 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 11:52	Success	-	View
exp_self.20260420114358.721_20260420_114358 Paper: self.20260420114358.721	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420114358.721 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 11:45	Success	-	View
exp_self.20260420113617.720_20260420_113618 Paper: self.20260420113617.720	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420113617.720 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 11:37	Success	-	View
exp_pytrain.20260420113344.178_20260420_113345 Paper: pytrain.20260420113344.178	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 11:34	Success	-	View
exp_self.20260420112743.719_20260420_112744 Paper: self.20260420112743.719	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420112743.719 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 11:28	Success	-	View
exp_self.20260420112006.718_20260420_112007 Paper: self.20260420112006.718	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420112006.718 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 11:21	Success	-	View
exp_self.20260420111232.717_20260420_111232 Paper: self.20260420111232.717	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420111232.717 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 11:13	Success	-	View
exp_self.20260420110501.716_20260420_110501 Paper: self.20260420110501.716	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420110501.716 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 11:06	Success	-	View
exp_pytrain.20260420110221.177_20260420_110221 Paper: pytrain.20260420110221.177	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 11:03	Success	-	View
exp_self.20260420105523.715_20260420_105524 Paper: self.20260420105523.715	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420105523.715 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 10:56	Success	-	View
exp_self.20260420104751.714_20260420_104751 Paper: self.20260420104751.714	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420104751.714 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 10:48	Success	-	View
exp_self.20260420104018.713_20260420_104018 Paper: self.20260420104018.713	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420104018.713 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 10:41	Success	-	View
exp_self.20260420103245.712_20260420_103245 Paper: self.20260420103245.712	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420103245.712 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 10:33	Success	-	View
exp_pytrain.20260420103005.176_20260420_103005 Paper: pytrain.20260420103005.176	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 10:31	Success	-	View
exp_self.20260420102311.711_20260420_102311 Paper: self.20260420102311.711	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420102311.711 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 10:24	Success	-	View
exp_self.20260420101529.710_20260420_101529 Paper: self.20260420101529.710	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420101529.710 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 10:16	Success	-	View
exp_self.20260420100748.709_20260420_100748 Paper: self.20260420100748.709	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420100748.709 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 10:08	Success	-	View
exp_self.20260420100008.708_20260420_100009 Paper: self.20260420100008.708	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420100008.708 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 10:01	Success	-	View
exp_pytrain.20260420095729.175_20260420_095729 Paper: pytrain.20260420095729.175	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 09:58	Success	-	View
exp_self.20260420095036.707_20260420_095036 Paper: self.20260420095036.707	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420095036.707 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 09:51	Success	-	View
exp_self.20260420094301.706_20260420_094301 Paper: self.20260420094301.706	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420094301.706 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 09:44	Success	-	View
exp_self.20260420093521.705_20260420_093522 Paper: self.20260420093521.705	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420093521.705 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 09:36	Success	-	View
exp_self.20260420092746.704_20260420_092747 Paper: self.20260420092746.704	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420092746.704 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 09:28	Success	-	View
exp_pytrain.20260420092517.174_20260420_092517 Paper: pytrain.20260420092517.174	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 09:26	Success	-	View
exp_self.20260420091928.703_20260420_091929 Paper: self.20260420091928.703	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420091928.703 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 09:20	Success	-	View
exp_self.20260420091144.702_20260420_091144 Paper: self.20260420091144.702	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420091144.702 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 09:12	Success	-	View
exp_self.20260420090352.701_20260420_090353 Paper: self.20260420090352.701	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420090352.701 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 09:04	Success	-	View
exp_self.20260420085617.700_20260420_085618 Paper: self.20260420085617.700	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420085617.700 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 08:57	Success	-	View
exp_pytrain.20260420085354.173_20260420_085354 Paper: pytrain.20260420085354.173	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 08:54	Success	-	View
exp_self.20260420084636.699_20260420_084636 Paper: self.20260420084636.699	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420084636.699 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 08:47	Success	-	View
exp_self.20260420083910.698_20260420_083910 Paper: self.20260420083910.698	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420083910.698 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 08:40	Success	-	View
exp_self.20260420083130.697_20260420_083130 Paper: self.20260420083130.697	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420083130.697 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 08:32	Success	-	View
exp_self.20260420082402.696_20260420_082402 Paper: self.20260420082402.696	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420082402.696 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 08:25	Success	-	View
exp_pytrain.20260420082145.172_20260420_082146 Paper: pytrain.20260420082145.172	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 08:22	Success	-	View
exp_self.20260420081446.695_20260420_081446 Paper: self.20260420081446.695	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420081446.695 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 08:15	Success	-	View
exp_self.20260420080726.694_20260420_080727 Paper: self.20260420080726.694	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420080726.694 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 08:08	Success	-	View
exp_self.20260420080006.693_20260420_080007 Paper: self.20260420080006.693	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420080006.693 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 08:01	Success	-	View
exp_hf_2604.16027_20260420_075431 Paper: hf_2604.16027	Where does output diversity collapse in post-training? Paper ID: hf_2604.16027 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-20 07:55	Success	-	View
exp_self.20260420075238.692_20260420_075238 Paper: self.20260420075238.692	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420075238.692 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 07:53	Success	-	View
exp_pytrain.20260420075013.171_20260420_075013 Paper: pytrain.20260420075013.171	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 07:51	Success	-	View
exp_self.20260420074327.691_20260420_074328 Paper: self.20260420074327.691	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420074327.691 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 07:44	Success	-	View
exp_self.20260420073603.690_20260420_073603 Paper: self.20260420073603.690	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420073603.690 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 07:37	Success	-	View
exp_self.20260420072840.689_20260420_072840 Paper: self.20260420072840.689	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420072840.689 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 07:29	Success	-	View
exp_self.20260420072121.688_20260420_072121 Paper: self.20260420072121.688	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420072121.688 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 07:22	Success	-	View
exp_pytrain.20260420071859.170_20260420_071900 Paper: pytrain.20260420071859.170	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 07:20	Success	-	View
exp_self.20260420071201.687_20260420_071201 Paper: self.20260420071201.687	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420071201.687 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 07:13	Success	-	View
exp_hf_2604.15923_20260420_070744 Paper: hf_2604.15923	Hierarchical Codec Diffusion for Video-to-Speech Generation Paper ID: hf_2604.15923 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-20 07:08	Success	-	View
exp_self.20260420070442.686_20260420_070443 Paper: self.20260420070442.686	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420070442.686 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 07:05	Success	-	View
exp_self.20260420065724.685_20260420_065724 Paper: self.20260420065724.685	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420065724.685 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 06:58	Success	-	View
exp_self.20260420065003.684_20260420_065003 Paper: self.20260420065003.684	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420065003.684 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 06:51	Success	-	View
exp_pytrain.20260420064745.169_20260420_064746 Paper: pytrain.20260420064745.169	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 06:48	Success	-	View
exp_self.20260420064036.683_20260420_064037 Paper: self.20260420064036.683	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420064036.683 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 06:41	Success	-	View
exp_self.20260420063318.682_20260420_063319 Paper: self.20260420063318.682	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420063318.682 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 06:34	Success	-	View
exp_self.20260420062552.681_20260420_062552 Paper: self.20260420062552.681	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420062552.681 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 06:26	Success	-	View
exp_self.20260420061829.680_20260420_061829 Paper: self.20260420061829.680	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420061829.680 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 06:19	Success	-	View
exp_pytrain.20260420061611.168_20260420_061611 Paper: pytrain.20260420061611.168	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 06:17	Success	-	View
exp_self.20260420060919.679_20260420_060920 Paper: self.20260420060919.679	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420060919.679 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 06:10	Success	-	View
exp_self.20260420060203.678_20260420_060204 Paper: self.20260420060203.678	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420060203.678 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 06:03	Success	-	View
exp_self.20260420055439.677_20260420_055439 Paper: self.20260420055439.677	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420055439.677 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 05:55	Success	-	View
exp_self.20260420054712.676_20260420_054713 Paper: self.20260420054712.676	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420054712.676 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 05:48	Success	-	View
exp_pytrain.20260420054454.167_20260420_054455 Paper: pytrain.20260420054454.167	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 05:45	Success	-	View
exp_self.20260420053802.675_20260420_053803 Paper: self.20260420053802.675	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420053802.675 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 05:39	Success	-	View
exp_self.20260420053042.674_20260420_053042 Paper: self.20260420053042.674	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420053042.674 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 05:31	Success	-	View
exp_self.20260420052238.673_20260420_052239 Paper: self.20260420052238.673	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420052238.673 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 05:23	Success	-	View
exp_self.20260420051442.672_20260420_051442 Paper: self.20260420051442.672	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420051442.672 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 05:15	Success	-	View
exp_pytrain.20260420051214.166_20260420_051214 Paper: pytrain.20260420051214.166	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 05:13	Success	-	View
exp_self.20260420050509.671_20260420_050509 Paper: self.20260420050509.671	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420050509.671 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 05:06	Success	-	View
exp_self.20260420045731.670_20260420_045732 Paper: self.20260420045731.670	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420045731.670 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 04:58	Success	-	View
exp_self.20260420044955.669_20260420_044955 Paper: self.20260420044955.669	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420044955.669 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 04:50	Success	-	View
exp_self.20260420044214.668_20260420_044214 Paper: self.20260420044214.668	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420044214.668 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 04:43	Success	-	View
exp_pytrain.20260420043942.165_20260420_043943 Paper: pytrain.20260420043942.165	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 04:40	Success	-	View
exp_self.20260420043244.667_20260420_043244 Paper: self.20260420043244.667	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420043244.667 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 04:33	Success	-	View
exp_hf_2604.12012_20260420_042820 Paper: hf_2604.12012	TIPSv2: Advancing Vision-Language Pretraining with Enhanced Patch-Text Alignment Paper ID: hf_2604.12012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-20 04:29	Success	-	View
exp_self.20260420042505.666_20260420_042506 Paper: self.20260420042505.666	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420042505.666 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 04:26	Success	-	View
exp_self.20260420041723.665_20260420_041724 Paper: self.20260420041723.665	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420041723.665 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 04:18	Success	-	View
exp_self.20260420040953.664_20260420_040953 Paper: self.20260420040953.664	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420040953.664 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 04:10	Success	-	View
exp_pytrain.20260420040726.164_20260420_040726 Paper: pytrain.20260420040726.164	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 04:08	Success	-	View
exp_self.20260420040018.663_20260420_040018 Paper: self.20260420040018.663	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420040018.663 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 04:01	Success	-	View
exp_self.20260420035249.662_20260420_035250 Paper: self.20260420035249.662	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420035249.662 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 03:53	Success	-	View
exp_self.20260420034510.661_20260420_034510 Paper: self.20260420034510.661	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420034510.661 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 03:46	Success	-	View
exp_self.20260420033738.660_20260420_033739 Paper: self.20260420033738.660	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420033738.660 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 03:38	Success	-	View
exp_pytrain.20260420033514.163_20260420_033514 Paper: pytrain.20260420033514.163	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 03:36	Success	-	View
exp_self.20260420033055.659_20260420_033055 Paper: self.20260420033055.659	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420033055.659 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 03:31	Success	-	View
exp_self.20260420032322.658_20260420_032323 Paper: self.20260420032322.658	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420032322.658 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 03:24	Success	-	View
exp_hf_2604.14663_20260420_032024 Paper: hf_2604.14663	EdgeDetect: Importance-Aware Gradient Compression with Homomorphic Aggregation for Federated Intrusion Detection Paper ID: hf_2604.14663 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-20 03:21	Success	-	View
exp_self.20260420031313.657_20260420_031314 Paper: self.20260420031313.657	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420031313.657 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 03:14	Success	-	View
exp_self.20260420030546.656_20260420_030546 Paper: self.20260420030546.656	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420030546.656 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 03:06	Success	-	View
exp_pytrain.20260420030311.162_20260420_030311 Paper: pytrain.20260420030311.162	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 03:04	Success	-	View
exp_self.20260420025611.655_20260420_025612 Paper: self.20260420025611.655	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420025611.655 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 02:57	Success	-	View
exp_self.20260420024837.654_20260420_024838 Paper: self.20260420024837.654	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420024837.654 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 02:49	Success	-	View
exp_self.20260420024106.653_20260420_024107 Paper: self.20260420024106.653	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420024106.653 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 02:42	Success	-	View
exp_self.20260420023337.652_20260420_023337 Paper: self.20260420023337.652	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420023337.652 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 02:34	Success	-	View
exp_pytrain.20260420023104.161_20260420_023105 Paper: pytrain.20260420023104.161	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 02:32	Success	-	View
exp_self.20260420022412.651_20260420_022412 Paper: self.20260420022412.651	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420022412.651 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 02:25	Success	-	View
exp_self.20260420021632.650_20260420_021633 Paper: self.20260420021632.650	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420021632.650 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 02:17	Success	-	View
exp_self.20260420020900.649_20260420_020900 Paper: self.20260420020900.649	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420020900.649 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 02:10	Success	-	View
exp_self.20260420020128.648_20260420_020129 Paper: self.20260420020128.648	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420020128.648 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 02:02	Success	-	View
exp_pytrain.20260420015858.160_20260420_015858 Paper: pytrain.20260420015858.160	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 02:00	Success	-	View
exp_self.20260420015204.647_20260420_015205 Paper: self.20260420015204.647	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420015204.647 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 01:53	Success	-	View
exp_self.20260420014432.646_20260420_014433 Paper: self.20260420014432.646	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420014432.646 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 01:45	Success	-	View
exp_self.20260420013703.645_20260420_013704 Paper: self.20260420013703.645	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420013703.645 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 01:38	Success	-	View
exp_self.20260420012940.644_20260420_012940 Paper: self.20260420012940.644	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420012940.644 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 01:30	Success	-	View
exp_pytrain.20260420012715.159_20260420_012716 Paper: pytrain.20260420012715.159	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 01:28	Success	-	View
exp_self.20260420012022.643_20260420_012022 Paper: self.20260420012022.643	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420012022.643 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 01:21	Success	-	View
exp_self.20260420011253.642_20260420_011253 Paper: self.20260420011253.642	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420011253.642 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 01:13	Success	-	View
exp_self.20260420010520.641_20260420_010521 Paper: self.20260420010520.641	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420010520.641 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 01:06	Success	-	View
exp_self.20260420005755.640_20260420_005755 Paper: self.20260420005755.640	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420005755.640 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 00:58	Success	-	View
exp_pytrain.20260420005530.158_20260420_005531 Paper: pytrain.20260420005530.158	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 00:56	Success	-	View
exp_self.20260420004833.639_20260420_004833 Paper: self.20260420004833.639	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420004833.639 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 00:49	Success	-	View
exp_self.20260420004108.638_20260420_004109 Paper: self.20260420004108.638	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420004108.638 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 00:42	Success	-	View
exp_self.20260420003340.637_20260420_003341 Paper: self.20260420003340.637	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420003340.637 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 00:34	Success	-	View
exp_self.20260420002613.636_20260420_002613 Paper: self.20260420002613.636	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420002613.636 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 00:27	Success	-	View
exp_pytrain.20260420002350.157_20260420_002350 Paper: pytrain.20260420002350.157	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-20 00:24	Success	-	View
exp_hf_2511.15915_20260420_001955 Paper: hf_2511.15915	AccelOpt: A Self-Improving LLM Agentic System for AI Accelerator Kernel Optimization Paper ID: hf_2511.15915 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-20 00:20	Success	-	View
exp_self.20260420001644.635_20260420_001645 Paper: self.20260420001644.635	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420001644.635 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 00:17	Success	-	View
exp_self.20260420000919.634_20260420_000920 Paper: self.20260420000919.634	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420000919.634 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 00:10	Success	-	View
exp_hf_2604.16299_20260420_000500 Paper: hf_2604.16299	Repurposing 3D Generative Model for Autoregressive Layout Generation Paper ID: hf_2604.16299 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-20 00:06	Success	-	View
exp_self.20260420000150.633_20260420_000151 Paper: self.20260420000150.633	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260420000150.633 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-20 00:02	Success	-	View
exp_self.20260419235423.632_20260419_235423 Paper: self.20260419235423.632	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419235423.632 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 23:55	Success	-	View
exp_pytrain.20260419235158.156_20260419_235158 Paper: pytrain.20260419235158.156	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 23:53	Success	-	View
exp_self.20260419234458.631_20260419_234459 Paper: self.20260419234458.631	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419234458.631 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 23:46	Success	-	View
exp_self.20260419233731.630_20260419_233731 Paper: self.20260419233731.630	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419233731.630 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 23:38	Success	-	View
exp_self.20260419233002.629_20260419_233002 Paper: self.20260419233002.629	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419233002.629 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 23:31	Success	-	View
exp_self.20260419232234.628_20260419_232235 Paper: self.20260419232234.628	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419232234.628 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 23:23	Success	-	View
exp_pytrain.20260419232005.155_20260419_232006 Paper: pytrain.20260419232005.155	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 23:21	Success	-	View
exp_self.20260419231306.627_20260419_231307 Paper: self.20260419231306.627	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419231306.627 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 23:14	Success	-	View
exp_self.20260419230540.626_20260419_230541 Paper: self.20260419230540.626	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419230540.626 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 23:06	Success	-	View
exp_self.20260419225818.625_20260419_225818 Paper: self.20260419225818.625	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419225818.625 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 22:59	Success	-	View
exp_self.20260419225049.624_20260419_225049 Paper: self.20260419225049.624	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419225049.624 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 22:51	Success	-	View
exp_pytrain.20260419224822.154_20260419_224822 Paper: pytrain.20260419224822.154	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 22:49	Success	-	View
exp_self.20260419224122.623_20260419_224122 Paper: self.20260419224122.623	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419224122.623 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 22:42	Success	-	View
exp_self.20260419223357.622_20260419_223357 Paper: self.20260419223357.622	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419223357.622 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 22:35	Success	-	View
exp_self.20260419222636.621_20260419_222636 Paper: self.20260419222636.621	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419222636.621 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 22:27	Success	-	View
exp_self.20260419221909.620_20260419_221910 Paper: self.20260419221909.620	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419221909.620 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 22:20	Success	-	View
exp_pytrain.20260419221638.153_20260419_221638 Paper: pytrain.20260419221638.153	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 22:17	Success	-	View
exp_self.20260419221224.619_20260419_221224 Paper: self.20260419221224.619	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419221224.619 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 22:13	Success	-	View
exp_hf_2604.16029_20260419_220803 Paper: hf_2604.16029	Cut Your Losses! Learning to Prune Paths Early for Efficient Parallel Reasoning Paper ID: hf_2604.16029 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-19 22:09	Success	-	View
exp_self.20260419220449.618_20260419_220450 Paper: self.20260419220449.618	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419220449.618 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 22:05	Success	-	View
exp_self.20260419215720.617_20260419_215720 Paper: self.20260419215720.617	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419215720.617 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 21:58	Success	-	View
exp_2604.16298v1_20260419_215151 Paper: 2604.16298v1	FineCog-Nav: Integrating Fine-grained Cognitive Modules for Zero-shot Multimodal UAV Navigation Paper ID: 2604.16298v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-19 21:52	Success	-	View
exp_self.20260419214940.616_20260419_214940 Paper: self.20260419214940.616	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419214940.616 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 21:50	Success	-	View
exp_2604.16299v1_20260419_214648 Paper: 2604.16299v1	Repurposing 3D Generative Model for Autoregressive Layout Generation Paper ID: 2604.16299v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-19 21:47	Success	-	View
exp_pytrain.20260419214443.152_20260419_214444 Paper: pytrain.20260419214443.152	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 21:45	Success	-	View
exp_self.20260419214025.615_20260419_214026 Paper: self.20260419214025.615	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419214025.615 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 21:41	Success	-	View
exp_self.20260419213250.614_20260419_213251 Paper: self.20260419213250.614	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419213250.614 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 21:33	Success	-	View
exp_self.20260419212519.613_20260419_212519 Paper: self.20260419212519.613	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419212519.613 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 21:26	Success	-	View
exp_self.20260419211729.612_20260419_211729 Paper: self.20260419211729.612	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419211729.612 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 21:18	Success	-	View
exp_pytrain.20260419211223.151_20260419_211223 Paper: pytrain.20260419211223.151	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 21:13	Success	-	View
exp_self.20260419210959.611_20260419_211000 Paper: self.20260419210959.611	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419210959.611 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 21:11	Success	-	View
exp_hf_2604.15804_20260419_210625 Paper: hf_2604.15804	Qwen3.5-Omni Technical Report Paper ID: hf_2604.15804 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-19 21:07	Success	-	View
exp_gh_arsalanafzal010_SmartRAG_20260419_210120 Paper: gh_arsalanafzal010_SmartRAG	arsalanafzal010/SmartRAG Paper ID: gh_arsalanafzal010_SmartRAG - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recove...	04-19 21:02	Success	-	View
exp_self.20260419205909.610_20260419_205909 Paper: self.20260419205909.610	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419205909.610 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 21:00	Success	-	View
exp_2604.16205v1_20260419_205549 Paper: 2604.16205v1	ChemGraph-XANES: An Agentic Framework for XANES Simulation and Analysis Paper ID: 2604.16205v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-19 20:56	Success	-	View
exp_self.20260419205013.609_20260419_205014 Paper: self.20260419205013.609	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419205013.609 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 20:51	Success	-	View
exp_2604.16207v1_20260419_204446 Paper: 2604.16207v1	AIFIND: Artifact-Aware Interpreting Fine-Grained Alignment for Incremental Face Forgery Detection Paper ID: 2604.16207v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-19 20:45	Success	-	View
exp_self.20260419204235.608_20260419_204236 Paper: self.20260419204235.608	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419204235.608 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 20:43	Success	-	View
exp_pytrain.20260419204002.150_20260419_204002 Paper: pytrain.20260419204002.150	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 20:41	Success	-	View
exp_self.20260419203305.607_20260419_203305 Paper: self.20260419203305.607	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419203305.607 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 20:34	Success	-	View
exp_self.20260419202533.606_20260419_202534 Paper: self.20260419202533.606	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419202533.606 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 20:26	Success	-	View
exp_self.20260419201810.605_20260419_201810 Paper: self.20260419201810.605	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419201810.605 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 20:19	Success	-	View
exp_self.20260419201041.604_20260419_201041 Paper: self.20260419201041.604	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419201041.604 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 20:11	Success	-	View
exp_pytrain.20260419200803.149_20260419_200803 Paper: pytrain.20260419200803.149	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 20:09	Success	-	View
exp_self.20260419200111.603_20260419_200112 Paper: self.20260419200111.603	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419200111.603 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 20:02	Success	-	View
exp_self.20260419195339.602_20260419_195339 Paper: self.20260419195339.602	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419195339.602 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 19:54	Success	-	View
exp_self.20260419194609.601_20260419_194609 Paper: self.20260419194609.601	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419194609.601 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 19:47	Success	-	View
exp_self.20260419193841.600_20260419_193841 Paper: self.20260419193841.600	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419193841.600 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 19:39	Success	-	View
exp_pytrain.20260419193606.148_20260419_193606 Paper: pytrain.20260419193606.148	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 19:37	Success	-	View
exp_self.20260419192913.599_20260419_192914 Paper: self.20260419192913.599	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419192913.599 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 19:30	Success	-	View
exp_self.20260419192146.598_20260419_192146 Paper: self.20260419192146.598	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419192146.598 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 19:22	Success	-	View
exp_self.20260419191419.597_20260419_191419 Paper: self.20260419191419.597	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419191419.597 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 19:15	Success	-	View
exp_self.20260419190652.596_20260419_190652 Paper: self.20260419190652.596	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419190652.596 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 19:07	Success	-	View
exp_pytrain.20260419190421.147_20260419_190421 Paper: pytrain.20260419190421.147	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 19:05	Success	-	View
exp_self.20260419185729.595_20260419_185730 Paper: self.20260419185729.595	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419185729.595 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 18:58	Success	-	View
exp_self.20260419185000.594_20260419_185000 Paper: self.20260419185000.594	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419185000.594 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 18:51	Success	-	View
exp_self.20260419184226.593_20260419_184227 Paper: self.20260419184226.593	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419184226.593 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 18:43	Success	-	View
exp_self.20260419183458.592_20260419_183458 Paper: self.20260419183458.592	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419183458.592 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 18:36	Success	-	View
exp_pytrain.20260419183221.146_20260419_183221 Paper: pytrain.20260419183221.146	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 18:33	Success	-	View
exp_self.20260419182529.591_20260419_182529 Paper: self.20260419182529.591	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419182529.591 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 18:26	Success	-	View
exp_self.20260419181801.590_20260419_181802 Paper: self.20260419181801.590	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419181801.590 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 18:19	Success	-	View
exp_self.20260419181034.589_20260419_181034 Paper: self.20260419181034.589	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419181034.589 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 18:11	Success	-	View
exp_self.20260419180310.588_20260419_180310 Paper: self.20260419180310.588	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419180310.588 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 18:04	Success	-	View
exp_pytrain.20260419180044.145_20260419_180044 Paper: pytrain.20260419180044.145	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 18:01	Success	-	View
exp_self.20260419175351.587_20260419_175351 Paper: self.20260419175351.587	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419175351.587 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 17:54	Success	-	View
exp_self.20260419174627.586_20260419_174627 Paper: self.20260419174627.586	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419174627.586 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 17:47	Success	-	View
exp_self.20260419173846.585_20260419_173847 Paper: self.20260419173846.585	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419173846.585 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 17:39	Success	-	View
exp_self.20260419173109.584_20260419_173109 Paper: self.20260419173109.584	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419173109.584 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 17:32	Success	-	View
exp_pytrain.20260419172841.144_20260419_172842 Paper: pytrain.20260419172841.144	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 17:29	Success	-	View
exp_self.20260419172148.583_20260419_172148 Paper: self.20260419172148.583	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419172148.583 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 17:22	Success	-	View
exp_self.20260419171425.582_20260419_171425 Paper: self.20260419171425.582	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419171425.582 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 17:15	Success	-	View
exp_self.20260419170654.581_20260419_170654 Paper: self.20260419170654.581	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419170654.581 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 17:07	Success	-	View
exp_self.20260419165915.580_20260419_165915 Paper: self.20260419165915.580	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419165915.580 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 17:00	Success	-	View
exp_pytrain.20260419165648.143_20260419_165648 Paper: pytrain.20260419165648.143	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 16:57	Success	-	View
exp_self.20260419164954.579_20260419_164954 Paper: self.20260419164954.579	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419164954.579 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 16:50	Success	-	View
exp_self.20260419164229.578_20260419_164230 Paper: self.20260419164229.578	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419164229.578 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 16:43	Success	-	View
exp_self.20260419163505.577_20260419_163505 Paper: self.20260419163505.577	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419163505.577 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 16:36	Success	-	View
exp_self.20260419162737.576_20260419_162737 Paper: self.20260419162737.576	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419162737.576 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 16:28	Success	-	View
exp_pytrain.20260419162506.142_20260419_162507 Paper: pytrain.20260419162506.142	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 16:26	Success	-	View
exp_self.20260419161811.575_20260419_161811 Paper: self.20260419161811.575	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419161811.575 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 16:19	Success	-	View
exp_self.20260419161045.574_20260419_161045 Paper: self.20260419161045.574	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419161045.574 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 16:11	Success	-	View
exp_self.20260419160310.573_20260419_160310 Paper: self.20260419160310.573	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419160310.573 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 16:04	Success	-	View
exp_self.20260419155532.572_20260419_155533 Paper: self.20260419155532.572	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419155532.572 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 15:56	Success	-	View
exp_pytrain.20260419155257.141_20260419_155257 Paper: pytrain.20260419155257.141	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 15:54	Success	-	View
exp_self.20260419154600.571_20260419_154600 Paper: self.20260419154600.571	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419154600.571 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 15:47	Success	-	View
exp_self.20260419153830.570_20260419_153831 Paper: self.20260419153830.570	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419153830.570 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 15:39	Success	-	View
exp_self.20260419153101.569_20260419_153102 Paper: self.20260419153101.569	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419153101.569 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 15:32	Success	-	View
exp_self.20260419152327.568_20260419_152327 Paper: self.20260419152327.568	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419152327.568 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 15:24	Success	-	View
exp_pytrain.20260419152047.140_20260419_152047 Paper: pytrain.20260419152047.140	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 15:21	Success	-	View
exp_self.20260419151349.567_20260419_151350 Paper: self.20260419151349.567	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419151349.567 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 15:14	Success	-	View
exp_self.20260419150616.566_20260419_150617 Paper: self.20260419150616.566	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419150616.566 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 15:07	Success	-	View
exp_self.20260419145845.565_20260419_145845 Paper: self.20260419145845.565	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419145845.565 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 14:59	Success	-	View
exp_self.20260419145119.564_20260419_145120 Paper: self.20260419145119.564	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419145119.564 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 14:52	Success	-	View
exp_pytrain.20260419144842.139_20260419_144842 Paper: pytrain.20260419144842.139	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 14:49	Success	-	View
exp_self.20260419144146.563_20260419_144146 Paper: self.20260419144146.563	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419144146.563 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 14:42	Success	-	View
exp_self.20260419143416.562_20260419_143416 Paper: self.20260419143416.562	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419143416.562 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 14:35	Success	-	View
exp_self.20260419142647.561_20260419_142648 Paper: self.20260419142647.561	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419142647.561 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 14:27	Success	-	View
exp_self.20260419141918.560_20260419_141918 Paper: self.20260419141918.560	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419141918.560 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 14:20	Success	-	View
exp_pytrain.20260419141643.138_20260419_141644 Paper: pytrain.20260419141643.138	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 14:17	Success	-	View
exp_self.20260419140950.559_20260419_140950 Paper: self.20260419140950.559	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419140950.559 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 14:10	Success	-	View
exp_self.20260419140218.558_20260419_140219 Paper: self.20260419140218.558	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419140218.558 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 14:03	Success	-	View
exp_self.20260419135445.557_20260419_135446 Paper: self.20260419135445.557	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419135445.557 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 13:55	Success	-	View
exp_self.20260419134717.556_20260419_134717 Paper: self.20260419134717.556	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419134717.556 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 13:48	Success	-	View
exp_pytrain.20260419134442.137_20260419_134442 Paper: pytrain.20260419134442.137	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 13:45	Success	-	View
exp_self.20260419133740.555_20260419_133740 Paper: self.20260419133740.555	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419133740.555 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 13:38	Success	-	View
exp_self.20260419133011.554_20260419_133011 Paper: self.20260419133011.554	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419133011.554 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 13:31	Success	-	View
exp_self.20260419132245.553_20260419_132246 Paper: self.20260419132245.553	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419132245.553 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 13:23	Success	-	View
exp_self.20260419131517.552_20260419_131517 Paper: self.20260419131517.552	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419131517.552 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 13:16	Success	-	View
exp_pytrain.20260419131249.136_20260419_131249 Paper: pytrain.20260419131249.136	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 13:13	Success	-	View
exp_self.20260419130659.551_20260419_130659 Paper: self.20260419130659.551	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419130659.551 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 13:08	Success	-	View
exp_self.20260419125854.550_20260419_125854 Paper: self.20260419125854.550	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419125854.550 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 12:59	Success	-	View
exp_self.20260419125120.549_20260419_125120 Paper: self.20260419125120.549	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419125120.549 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 12:52	Success	-	View
exp_self.20260419124350.548_20260419_124350 Paper: self.20260419124350.548	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419124350.548 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 12:44	Success	-	View
exp_pytrain.20260419124124.135_20260419_124124 Paper: pytrain.20260419124124.135	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 12:42	Success	-	View
exp_self.20260419123429.547_20260419_123430 Paper: self.20260419123429.547	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419123429.547 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 12:35	Success	-	View
exp_self.20260419122704.546_20260419_122705 Paper: self.20260419122704.546	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419122704.546 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 12:28	Success	-	View
exp_self.20260419121934.545_20260419_121934 Paper: self.20260419121934.545	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419121934.545 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 12:20	Success	-	View
exp_self.20260419121155.544_20260419_121155 Paper: self.20260419121155.544	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419121155.544 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 12:12	Success	-	View
exp_pytrain.20260419120926.134_20260419_120926 Paper: pytrain.20260419120926.134	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 12:10	Success	-	View
exp_self.20260419120231.543_20260419_120232 Paper: self.20260419120231.543	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419120231.543 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 12:03	Success	-	View
exp_self.20260419115506.542_20260419_115507 Paper: self.20260419115506.542	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419115506.542 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 11:56	Success	-	View
exp_self.20260419114737.541_20260419_114737 Paper: self.20260419114737.541	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419114737.541 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 11:48	Success	-	View
exp_self.20260419114004.540_20260419_114004 Paper: self.20260419114004.540	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419114004.540 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 11:41	Success	-	View
exp_pytrain.20260419113726.133_20260419_113726 Paper: pytrain.20260419113726.133	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 11:38	Success	-	View
exp_self.20260419113022.539_20260419_113022 Paper: self.20260419113022.539	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419113022.539 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 11:31	Success	-	View
exp_self.20260419112252.538_20260419_112253 Paper: self.20260419112252.538	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419112252.538 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 11:23	Success	-	View
exp_self.20260419111523.537_20260419_111524 Paper: self.20260419111523.537	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419111523.537 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 11:16	Success	-	View
exp_self.20260419110751.536_20260419_110751 Paper: self.20260419110751.536	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419110751.536 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 11:08	Success	-	View
exp_pytrain.20260419110520.132_20260419_110520 Paper: pytrain.20260419110520.132	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 11:06	Success	-	View
exp_self.20260419105823.535_20260419_105823 Paper: self.20260419105823.535	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419105823.535 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 10:59	Success	-	View
exp_self.20260419105044.534_20260419_105045 Paper: self.20260419105044.534	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419105044.534 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 10:51	Success	-	View
exp_self.20260419104318.533_20260419_104319 Paper: self.20260419104318.533	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419104318.533 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 10:44	Success	-	View
exp_self.20260419103550.532_20260419_103551 Paper: self.20260419103550.532	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419103550.532 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 10:36	Success	-	View
exp_pytrain.20260419103315.131_20260419_103315 Paper: pytrain.20260419103315.131	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 10:34	Success	-	View
exp_self.20260419102620.531_20260419_102621 Paper: self.20260419102620.531	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419102620.531 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 10:27	Success	-	View
exp_self.20260419101913.530_20260419_101914 Paper: self.20260419101913.530	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419101913.530 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 10:20	Success	-	View
exp_self.20260419101135.529_20260419_101136 Paper: self.20260419101135.529	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419101135.529 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 10:12	Success	-	View
exp_self.20260419100400.528_20260419_100401 Paper: self.20260419100400.528	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419100400.528 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 10:05	Success	-	View
exp_pytrain.20260419100131.130_20260419_100131 Paper: pytrain.20260419100131.130	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 10:02	Success	-	View
exp_self.20260419095426.527_20260419_095426 Paper: self.20260419095426.527	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419095426.527 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 09:55	Success	-	View
exp_self.20260419094657.526_20260419_094657 Paper: self.20260419094657.526	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419094657.526 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 09:48	Success	-	View
exp_self.20260419093926.525_20260419_093927 Paper: self.20260419093926.525	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419093926.525 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 09:40	Success	-	View
exp_self.20260419093146.524_20260419_093147 Paper: self.20260419093146.524	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419093146.524 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 09:32	Success	-	View
exp_pytrain.20260419092914.129_20260419_092914 Paper: pytrain.20260419092914.129	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 09:30	Success	-	View
exp_self.20260419092211.523_20260419_092212 Paper: self.20260419092211.523	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419092211.523 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 09:23	Success	-	View
exp_self.20260419091441.522_20260419_091441 Paper: self.20260419091441.522	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419091441.522 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 09:15	Success	-	View
exp_self.20260419090713.521_20260419_090713 Paper: self.20260419090713.521	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419090713.521 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 09:08	Success	-	View
exp_self.20260419085938.520_20260419_085939 Paper: self.20260419085938.520	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419085938.520 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 09:00	Success	-	View
exp_pytrain.20260419085707.128_20260419_085707 Paper: pytrain.20260419085707.128	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 08:58	Success	-	View
exp_self.20260419085004.519_20260419_085005 Paper: self.20260419085004.519	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419085004.519 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 08:51	Success	-	View
exp_self.20260419084234.518_20260419_084234 Paper: self.20260419084234.518	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419084234.518 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 08:43	Success	-	View
exp_self.20260419083506.517_20260419_083507 Paper: self.20260419083506.517	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419083506.517 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 08:36	Success	-	View
exp_self.20260419082735.516_20260419_082735 Paper: self.20260419082735.516	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419082735.516 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 08:28	Success	-	View
exp_pytrain.20260419082504.127_20260419_082505 Paper: pytrain.20260419082504.127	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 08:26	Success	-	View
exp_self.20260419081810.515_20260419_081810 Paper: self.20260419081810.515	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419081810.515 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 08:19	Success	-	View
exp_self.20260419081039.514_20260419_081039 Paper: self.20260419081039.514	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419081039.514 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 08:11	Success	-	View
exp_self.20260419080309.513_20260419_080310 Paper: self.20260419080309.513	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419080309.513 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 08:04	Success	-	View
exp_self.20260419075537.512_20260419_075538 Paper: self.20260419075537.512	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419075537.512 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 07:56	Success	-	View
exp_pytrain.20260419075303.126_20260419_075303 Paper: pytrain.20260419075303.126	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 07:54	Success	-	View
exp_self.20260419074609.511_20260419_074609 Paper: self.20260419074609.511	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419074609.511 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 07:47	Success	-	View
exp_self.20260419073832.510_20260419_073833 Paper: self.20260419073832.510	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419073832.510 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 07:39	Success	-	View
exp_self.20260419073100.509_20260419_073100 Paper: self.20260419073100.509	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419073100.509 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 07:32	Success	-	View
exp_self.20260419072330.508_20260419_072330 Paper: self.20260419072330.508	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419072330.508 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 07:24	Success	-	View
exp_pytrain.20260419072052.125_20260419_072052 Paper: pytrain.20260419072052.125	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 07:21	Success	-	View
exp_self.20260419071353.507_20260419_071353 Paper: self.20260419071353.507	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419071353.507 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 07:14	Success	-	View
exp_self.20260419070619.506_20260419_070619 Paper: self.20260419070619.506	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419070619.506 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 07:07	Success	-	View
exp_self.20260419065842.505_20260419_065842 Paper: self.20260419065842.505	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419065842.505 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 06:59	Success	-	View
exp_self.20260419065108.504_20260419_065109 Paper: self.20260419065108.504	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419065108.504 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 06:52	Success	-	View
exp_pytrain.20260419064833.124_20260419_064834 Paper: pytrain.20260419064833.124	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 06:49	Success	-	View
exp_self.20260419064140.503_20260419_064140 Paper: self.20260419064140.503	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419064140.503 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 06:42	Success	-	View
exp_self.20260419063411.502_20260419_063411 Paper: self.20260419063411.502	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419063411.502 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 06:35	Success	-	View
exp_self.20260419062643.501_20260419_062643 Paper: self.20260419062643.501	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419062643.501 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 06:27	Success	-	View
exp_self.20260419061914.500_20260419_061914 Paper: self.20260419061914.500	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419061914.500 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 06:20	Success	-	View
exp_pytrain.20260419061641.123_20260419_061641 Paper: pytrain.20260419061641.123	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 06:17	Success	-	View
exp_self.20260419060944.499_20260419_060944 Paper: self.20260419060944.499	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419060944.499 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 06:10	Success	-	View
exp_self.20260419060213.498_20260419_060213 Paper: self.20260419060213.498	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419060213.498 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 06:03	Success	-	View
exp_self.20260419055446.497_20260419_055447 Paper: self.20260419055446.497	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419055446.497 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 05:55	Success	-	View
exp_self.20260419054718.496_20260419_054718 Paper: self.20260419054718.496	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419054718.496 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 05:48	Success	-	View
exp_pytrain.20260419054449.122_20260419_054450 Paper: pytrain.20260419054449.122	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 05:45	Success	-	View
exp_self.20260419053756.495_20260419_053756 Paper: self.20260419053756.495	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419053756.495 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 05:38	Success	-	View
exp_self.20260419053025.494_20260419_053025 Paper: self.20260419053025.494	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419053025.494 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 05:31	Success	-	View
exp_self.20260419052256.493_20260419_052256 Paper: self.20260419052256.493	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419052256.493 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 05:23	Success	-	View
exp_self.20260419051528.492_20260419_051529 Paper: self.20260419051528.492	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419051528.492 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 05:16	Success	-	View
exp_pytrain.20260419051303.121_20260419_051304 Paper: pytrain.20260419051303.121	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 05:14	Success	-	View
exp_self.20260419050738.491_20260419_050739 Paper: self.20260419050738.491	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419050738.491 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 05:08	Success	-	View
exp_self.20260419045945.490_20260419_045946 Paper: self.20260419045945.490	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419045945.490 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 05:00	Success	-	View
exp_self.20260419045146.489_20260419_045146 Paper: self.20260419045146.489	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419045146.489 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 04:52	Success	-	View
exp_self.20260419044412.488_20260419_044413 Paper: self.20260419044412.488	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419044412.488 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 04:45	Success	-	View
exp_pytrain.20260419044136.120_20260419_044137 Paper: pytrain.20260419044136.120	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 04:42	Success	-	View
exp_self.20260419043440.487_20260419_043441 Paper: self.20260419043440.487	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419043440.487 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 04:35	Success	-	View
exp_self.20260419042714.486_20260419_042714 Paper: self.20260419042714.486	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419042714.486 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 04:28	Success	-	View
exp_self.20260419041946.485_20260419_041946 Paper: self.20260419041946.485	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419041946.485 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 04:20	Success	-	View
exp_self.20260419041221.484_20260419_041221 Paper: self.20260419041221.484	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419041221.484 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 04:13	Success	-	View
exp_pytrain.20260419040945.119_20260419_040945 Paper: pytrain.20260419040945.119	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 04:10	Success	-	View
exp_self.20260419040253.483_20260419_040253 Paper: self.20260419040253.483	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419040253.483 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 04:03	Success	-	View
exp_self.20260419035518.482_20260419_035519 Paper: self.20260419035518.482	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419035518.482 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 03:56	Success	-	View
exp_self.20260419034751.481_20260419_034752 Paper: self.20260419034751.481	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419034751.481 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 03:48	Success	-	View
exp_self.20260419034029.480_20260419_034029 Paper: self.20260419034029.480	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419034029.480 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 03:41	Success	-	View
exp_pytrain.20260419033758.118_20260419_033759 Paper: pytrain.20260419033758.118	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 03:39	Success	-	View
exp_self.20260419033111.479_20260419_033111 Paper: self.20260419033111.479	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419033111.479 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 03:32	Success	-	View
exp_self.20260419032341.478_20260419_032342 Paper: self.20260419032341.478	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419032341.478 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 03:24	Success	-	View
exp_self.20260419031618.477_20260419_031618 Paper: self.20260419031618.477	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419031618.477 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 03:17	Success	-	View
exp_self.20260419030850.476_20260419_030850 Paper: self.20260419030850.476	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419030850.476 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 03:09	Success	-	View
exp_pytrain.20260419030620.117_20260419_030620 Paper: pytrain.20260419030620.117	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 03:07	Success	-	View
exp_self.20260419025927.475_20260419_025927 Paper: self.20260419025927.475	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419025927.475 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 03:00	Success	-	View
exp_self.20260419025204.474_20260419_025205 Paper: self.20260419025204.474	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419025204.474 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 02:53	Success	-	View
exp_self.20260419024434.473_20260419_024434 Paper: self.20260419024434.473	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419024434.473 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 02:45	Success	-	View
exp_self.20260419023709.472_20260419_023709 Paper: self.20260419023709.472	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419023709.472 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 02:38	Success	-	View
exp_pytrain.20260419023440.116_20260419_023441 Paper: pytrain.20260419023440.116	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 02:35	Success	-	View
exp_self.20260419022749.471_20260419_022749 Paper: self.20260419022749.471	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419022749.471 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 02:28	Success	-	View
exp_self.20260419022026.470_20260419_022026 Paper: self.20260419022026.470	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419022026.470 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 02:21	Success	-	View
exp_self.20260419021254.469_20260419_021254 Paper: self.20260419021254.469	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419021254.469 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 02:13	Success	-	View
exp_self.20260419020527.468_20260419_020528 Paper: self.20260419020527.468	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419020527.468 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 02:06	Success	-	View
exp_pytrain.20260419020301.115_20260419_020302 Paper: pytrain.20260419020301.115	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 02:04	Success	-	View
exp_self.20260419015600.467_20260419_015600 Paper: self.20260419015600.467	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419015600.467 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 01:57	Success	-	View
exp_self.20260419014834.466_20260419_014834 Paper: self.20260419014834.466	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419014834.466 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 01:49	Success	-	View
exp_self.20260419014110.465_20260419_014110 Paper: self.20260419014110.465	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419014110.465 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 01:42	Success	-	View
exp_self.20260419013340.464_20260419_013340 Paper: self.20260419013340.464	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419013340.464 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 01:34	Success	-	View
exp_pytrain.20260419013110.114_20260419_013110 Paper: pytrain.20260419013110.114	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 01:32	Success	-	View
exp_self.20260419012409.463_20260419_012409 Paper: self.20260419012409.463	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419012409.463 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 01:25	Success	-	View
exp_self.20260419011640.462_20260419_011641 Paper: self.20260419011640.462	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419011640.462 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 01:17	Success	-	View
exp_self.20260419010913.461_20260419_010913 Paper: self.20260419010913.461	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419010913.461 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 01:10	Success	-	View
exp_self.20260419010139.460_20260419_010140 Paper: self.20260419010139.460	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419010139.460 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 01:02	Success	-	View
exp_pytrain.20260419005910.113_20260419_005911 Paper: pytrain.20260419005910.113	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 01:00	Success	-	View
exp_self.20260419005211.459_20260419_005211 Paper: self.20260419005211.459	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419005211.459 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 00:53	Success	-	View
exp_self.20260419004447.458_20260419_004448 Paper: self.20260419004447.458	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419004447.458 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 00:45	Success	-	View
exp_self.20260419003719.457_20260419_003720 Paper: self.20260419003719.457	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419003719.457 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 00:38	Success	-	View
exp_self.20260419002939.456_20260419_002940 Paper: self.20260419002939.456	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419002939.456 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 00:30	Success	-	View
exp_pytrain.20260419002653.112_20260419_002654 Paper: pytrain.20260419002653.112	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-19 00:27	Success	-	View
exp_self.20260419001943.455_20260419_001943 Paper: self.20260419001943.455	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419001943.455 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 00:20	Success	-	View
exp_self.20260419001159.454_20260419_001159 Paper: self.20260419001159.454	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419001159.454 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 00:13	Success	-	View
exp_self.20260419000409.453_20260419_000410 Paper: self.20260419000409.453	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260419000409.453 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-19 00:05	Success	-	View
exp_cr_10.1108_compel-11-2025-0530_20260419_000037 Paper: cr_10.1108_compel-11-2025-0530	Analytical calculation model of eddy current loss of power transformer winding using method of images Paper ID: cr_10.1108_compel-11-2025-0530 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Rec...	04-19 00:01	Success	-	View
exp_self.20260418235718.452_20260418_235718 Paper: self.20260418235718.452	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418235718.452 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 23:58	Success	-	View
exp_pytrain.20260418235432.111_20260418_235432 Paper: pytrain.20260418235432.111	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 23:55	Success	-	View
exp_self.20260418234901.451_20260418_234902 Paper: self.20260418234901.451	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418234901.451 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 23:50	Success	-	View
exp_self.20260418234112.450_20260418_234113 Paper: self.20260418234112.450	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418234112.450 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 23:42	Success	-	View
exp_self.20260418233322.449_20260418_233322 Paper: self.20260418233322.449	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418233322.449 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 23:34	Success	-	View
exp_self.20260418232526.448_20260418_232526 Paper: self.20260418232526.448	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418232526.448 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 23:26	Success	-	View
exp_pytrain.20260418232248.110_20260418_232248 Paper: pytrain.20260418232248.110	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 23:23	Success	-	View
exp_self.20260418231640.447_20260418_231640 Paper: self.20260418231640.447	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418231640.447 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 23:17	Success	-	View
exp_self.20260418230849.446_20260418_230850 Paper: self.20260418230849.446	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418230849.446 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 23:09	Success	-	View
exp_self.20260418230100.445_20260418_230101 Paper: self.20260418230100.445	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418230100.445 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 23:02	Success	-	View
exp_self.20260418225320.444_20260418_225320 Paper: self.20260418225320.444	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418225320.444 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 22:54	Success	-	View
exp_pytrain.20260418225027.109_20260418_225027 Paper: pytrain.20260418225027.109	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 22:51	Success	-	View
exp_self.20260418224453.443_20260418_224454 Paper: self.20260418224453.443	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418224453.443 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 22:45	Success	-	View
exp_self.20260418223705.442_20260418_223705 Paper: self.20260418223705.442	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418223705.442 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 22:38	Success	-	View
exp_self.20260418222917.441_20260418_222917 Paper: self.20260418222917.441	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418222917.441 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 22:30	Success	-	View
exp_self.20260418222129.440_20260418_222130 Paper: self.20260418222129.440	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418222129.440 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 22:22	Success	-	View
exp_pytrain.20260418221849.108_20260418_221849 Paper: pytrain.20260418221849.108	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 22:19	Success	-	View
exp_self.20260418221313.439_20260418_221313 Paper: self.20260418221313.439	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418221313.439 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 22:14	Success	-	View
exp_self.20260418220533.438_20260418_220533 Paper: self.20260418220533.438	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418220533.438 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 22:06	Success	-	View
exp_self.20260418215744.437_20260418_215744 Paper: self.20260418215744.437	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418215744.437 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 21:58	Success	-	View
exp_self.20260418215003.436_20260418_215003 Paper: self.20260418215003.436	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418215003.436 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 21:51	Success	-	View
exp_pytrain.20260418214716.107_20260418_214716 Paper: pytrain.20260418214716.107	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 21:48	Success	-	View
exp_self.20260418214140.435_20260418_214141 Paper: self.20260418214140.435	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418214140.435 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 21:42	Success	-	View
exp_self.20260418213358.434_20260418_213358 Paper: self.20260418213358.434	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418213358.434 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 21:35	Success	-	View
exp_self.20260418212618.433_20260418_212618 Paper: self.20260418212618.433	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418212618.433 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 21:27	Success	-	View
exp_self.20260418211827.432_20260418_211827 Paper: self.20260418211827.432	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418211827.432 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 21:19	Success	-	View
exp_pytrain.20260418211549.106_20260418_211550 Paper: pytrain.20260418211549.106	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 21:16	Success	-	View
exp_self.20260418210835.431_20260418_210835 Paper: self.20260418210835.431	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418210835.431 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 21:09	Success	-	View
exp_self.20260418210141.430_20260418_210142 Paper: self.20260418210141.430	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418210141.430 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 21:02	Success	-	View
exp_self.20260418205350.429_20260418_205350 Paper: self.20260418205350.429	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418205350.429 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 20:54	Success	-	View
exp_self.20260418204558.428_20260418_204558 Paper: self.20260418204558.428	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418204558.428 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 20:47	Success	-	View
exp_pytrain.20260418204316.105_20260418_204316 Paper: pytrain.20260418204316.105	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 20:44	Success	-	View
exp_self.20260418203703.427_20260418_203703 Paper: self.20260418203703.427	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418203703.427 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 20:38	Success	-	View
exp_self.20260418202911.426_20260418_202911 Paper: self.20260418202911.426	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418202911.426 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 20:30	Success	-	View
exp_self.20260418202118.425_20260418_202119 Paper: self.20260418202118.425	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418202118.425 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 20:22	Success	-	View
exp_self.20260418201328.424_20260418_201329 Paper: self.20260418201328.424	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418201328.424 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 20:14	Success	-	View
exp_pytrain.20260418201046.104_20260418_201046 Paper: pytrain.20260418201046.104	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 20:11	Success	-	View
exp_self.20260418200329.423_20260418_200329 Paper: self.20260418200329.423	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418200329.423 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 20:04	Success	-	View
exp_gh_donitb934_1Cat-vLLM_20260418_200004 Paper: gh_donitb934_1Cat-vLLM	donitb934/1Cat-vLLM Paper ID: gh_donitb934_1Cat-vLLM - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered b...	04-18 20:01	Success	-	View
exp_self.20260418195635.422_20260418_195635 Paper: self.20260418195635.422	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418195635.422 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 19:57	Success	-	View
exp_self.20260418194852.421_20260418_194853 Paper: self.20260418194852.421	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418194852.421 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 19:49	Success	-	View
exp_self.20260418194119.420_20260418_194120 Paper: self.20260418194119.420	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418194119.420 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 19:42	Success	-	View
exp_pytrain.20260418193847.103_20260418_193848 Paper: pytrain.20260418193847.103	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 19:39	Success	-	View
exp_self.20260418193146.419_20260418_193146 Paper: self.20260418193146.419	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418193146.419 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 19:32	Success	-	View
exp_self.20260418192416.418_20260418_192416 Paper: self.20260418192416.418	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418192416.418 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 19:25	Success	-	View
exp_self.20260418191644.417_20260418_191645 Paper: self.20260418191644.417	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418191644.417 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 19:17	Success	-	View
exp_self.20260418190914.416_20260418_190914 Paper: self.20260418190914.416	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418190914.416 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 19:10	Success	-	View
exp_pytrain.20260418190636.102_20260418_190637 Paper: pytrain.20260418190636.102	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 19:07	Success	-	View
exp_self.20260418185942.415_20260418_185942 Paper: self.20260418185942.415	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418185942.415 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 19:00	Success	-	View
exp_self.20260418185209.414_20260418_185210 Paper: self.20260418185209.414	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418185209.414 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 18:53	Success	-	View
exp_self.20260418184436.413_20260418_184436 Paper: self.20260418184436.413	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418184436.413 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 18:45	Success	-	View
exp_self.20260418183707.412_20260418_183707 Paper: self.20260418183707.412	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418183707.412 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 18:38	Success	-	View
exp_pytrain.20260418183429.101_20260418_183430 Paper: pytrain.20260418183429.101	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 18:35	Success	-	View
exp_self.20260418182725.411_20260418_182726 Paper: self.20260418182725.411	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418182725.411 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 18:28	Success	-	View
exp_self.20260418181954.410_20260418_181955 Paper: self.20260418181954.410	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418181954.410 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 18:20	Success	-	View
exp_self.20260418181224.409_20260418_181224 Paper: self.20260418181224.409	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418181224.409 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 18:13	Success	-	View
exp_self.20260418180458.408_20260418_180458 Paper: self.20260418180458.408	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418180458.408 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 18:06	Success	-	View
exp_pytrain.20260418180223.100_20260418_180224 Paper: pytrain.20260418180223.100	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 18:03	Success	-	View
exp_self.20260418175805.407_20260418_175806 Paper: self.20260418175805.407	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418175805.407 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 17:59	Success	-	View
exp_self.20260418175032.406_20260418_175033 Paper: self.20260418175032.406	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418175032.406 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 17:51	Success	-	View
exp_cr_10.32628_ijsrst52310283_20260418_174742 Paper: cr_10.32628_ijsrst52310283	Enhancing Transformer Attention Mechanisms for Knowledge Retention in Fine-Tuned Large Language Models Paper ID: cr_10.32628_ijsrst52310283 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recover...	04-18 17:48	Success	-	View
exp_self.20260418174041.405_20260418_174041 Paper: self.20260418174041.405	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418174041.405 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 17:41	Success	-	View
exp_self.20260418173309.404_20260418_173309 Paper: self.20260418173309.404	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418173309.404 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 17:34	Success	-	View
exp_pytrain.20260418173035.099_20260418_173035 Paper: pytrain.20260418173035.099	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 17:31	Success	-	View
exp_self.20260418172329.403_20260418_172329 Paper: self.20260418172329.403	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418172329.403 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 17:24	Success	-	View
exp_self.20260418171601.402_20260418_171601 Paper: self.20260418171601.402	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418171601.402 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 17:17	Success	-	View
exp_self.20260418170833.401_20260418_170833 Paper: self.20260418170833.401	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418170833.401 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 17:09	Success	-	View
exp_self.20260418170053.400_20260418_170054 Paper: self.20260418170053.400	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418170053.400 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 17:01	Success	-	View
exp_pytrain.20260418165817.098_20260418_165818 Paper: pytrain.20260418165817.098	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 16:59	Success	-	View
exp_self.20260418165124.399_20260418_165125 Paper: self.20260418165124.399	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418165124.399 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 16:52	Success	-	View
exp_self.20260418164358.398_20260418_164358 Paper: self.20260418164358.398	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418164358.398 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 16:45	Success	-	View
exp_self.20260418163631.397_20260418_163631 Paper: self.20260418163631.397	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418163631.397 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 16:37	Success	-	View
exp_self.20260418162906.396_20260418_162907 Paper: self.20260418162906.396	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418162906.396 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 16:30	Success	-	View
exp_pytrain.20260418162635.097_20260418_162635 Paper: pytrain.20260418162635.097	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 16:27	Success	-	View
exp_self.20260418161943.395_20260418_161943 Paper: self.20260418161943.395	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418161943.395 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 16:20	Success	-	View
exp_self.20260418161218.394_20260418_161218 Paper: self.20260418161218.394	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418161218.394 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 16:13	Success	-	View
exp_self.20260418160446.393_20260418_160447 Paper: self.20260418160446.393	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418160446.393 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 16:05	Success	-	View
exp_self.20260418155720.392_20260418_155720 Paper: self.20260418155720.392	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418155720.392 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 15:58	Success	-	View
exp_pytrain.20260418155443.096_20260418_155444 Paper: pytrain.20260418155443.096	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 15:55	Success	-	View
exp_self.20260418154749.391_20260418_154750 Paper: self.20260418154749.391	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418154749.391 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 15:48	Success	-	View
exp_self.20260418154018.390_20260418_154018 Paper: self.20260418154018.390	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418154018.390 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 15:41	Success	-	View
exp_self.20260418153250.389_20260418_153251 Paper: self.20260418153250.389	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418153250.389 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 15:33	Success	-	View
exp_self.20260418152523.388_20260418_152524 Paper: self.20260418152523.388	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418152523.388 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 15:26	Success	-	View
exp_pytrain.20260418152251.095_20260418_152251 Paper: pytrain.20260418152251.095	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 15:23	Success	-	View
exp_self.20260418151559.387_20260418_151600 Paper: self.20260418151559.387	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418151559.387 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 15:17	Success	-	View
exp_self.20260418150819.386_20260418_150820 Paper: self.20260418150819.386	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418150819.386 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 15:09	Success	-	View
exp_gh_Bhavesh716_LLM-from-Scratch_20260418_150500 Paper: gh_Bhavesh716_LLM-from-Scratch	Bhavesh716/LLM-from-Scratch Paper ID: gh_Bhavesh716_LLM-from-Scratch - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Rec...	04-18 15:06	Success	-	View
exp_self.20260418150033.385_20260418_150033 Paper: self.20260418150033.385	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418150033.385 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 15:01	Success	-	View
exp_self.20260418145301.384_20260418_145301 Paper: self.20260418145301.384	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418145301.384 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 14:54	Success	-	View
exp_pytrain.20260418145033.094_20260418_145033 Paper: pytrain.20260418145033.094	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 14:51	Success	-	View
exp_self.20260418144331.383_20260418_144331 Paper: self.20260418144331.383	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418144331.383 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 14:44	Success	-	View
exp_self.20260418143605.382_20260418_143606 Paper: self.20260418143605.382	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418143605.382 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 14:37	Success	-	View
exp_self.20260418142840.381_20260418_142840 Paper: self.20260418142840.381	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418142840.381 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 14:29	Success	-	View
exp_self.20260418142110.380_20260418_142110 Paper: self.20260418142110.380	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418142110.380 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 14:22	Success	-	View
exp_pytrain.20260418141834.093_20260418_141834 Paper: pytrain.20260418141834.093	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 14:19	Success	-	View
exp_self.20260418141142.379_20260418_141142 Paper: self.20260418141142.379	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418141142.379 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 14:12	Success	-	View
exp_self.20260418140409.378_20260418_140409 Paper: self.20260418140409.378	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418140409.378 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 14:05	Success	-	View
exp_self.20260418135637.377_20260418_135637 Paper: self.20260418135637.377	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418135637.377 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 13:57	Success	-	View
exp_self.20260418134905.376_20260418_134905 Paper: self.20260418134905.376	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418134905.376 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 13:50	Success	-	View
exp_pytrain.20260418134627.092_20260418_134627 Paper: pytrain.20260418134627.092	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 13:47	Success	-	View
exp_self.20260418133933.375_20260418_133934 Paper: self.20260418133933.375	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418133933.375 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 13:40	Success	-	View
exp_self.20260418133202.374_20260418_133202 Paper: self.20260418133202.374	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418133202.374 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 13:33	Success	-	View
exp_self.20260418132433.373_20260418_132433 Paper: self.20260418132433.373	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418132433.373 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 13:25	Success	-	View
exp_self.20260418131709.372_20260418_131710 Paper: self.20260418131709.372	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418131709.372 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 13:18	Success	-	View
exp_pytrain.20260418131433.091_20260418_131434 Paper: pytrain.20260418131433.091	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 13:15	Success	-	View
exp_self.20260418130742.371_20260418_130742 Paper: self.20260418130742.371	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418130742.371 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 13:08	Success	-	View
exp_self.20260418130013.370_20260418_130013 Paper: self.20260418130013.370	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418130013.370 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 13:01	Success	-	View
exp_self.20260418125241.369_20260418_125241 Paper: self.20260418125241.369	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418125241.369 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 12:53	Success	-	View
exp_self.20260418124513.368_20260418_124514 Paper: self.20260418124513.368	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418124513.368 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 12:46	Success	-	View
exp_pytrain.20260418124240.090_20260418_124241 Paper: pytrain.20260418124240.090	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 12:43	Success	-	View
exp_self.20260418123550.367_20260418_123550 Paper: self.20260418123550.367	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418123550.367 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 12:36	Success	-	View
exp_self.20260418122819.366_20260418_122819 Paper: self.20260418122819.366	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418122819.366 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 12:29	Success	-	View
exp_self.20260418122023.365_20260418_122024 Paper: self.20260418122023.365	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418122023.365 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 12:21	Success	-	View
exp_self.20260418121255.364_20260418_121256 Paper: self.20260418121255.364	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418121255.364 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 12:13	Success	-	View
exp_pytrain.20260418121023.089_20260418_121023 Paper: pytrain.20260418121023.089	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 12:11	Success	-	View
exp_self.20260418120334.363_20260418_120334 Paper: self.20260418120334.363	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418120334.363 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 12:04	Success	-	View
exp_self.20260418115616.362_20260418_115616 Paper: self.20260418115616.362	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418115616.362 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 11:57	Success	-	View
exp_self.20260418114832.361_20260418_114832 Paper: self.20260418114832.361	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418114832.361 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 11:49	Success	-	View
exp_self.20260418114040.360_20260418_114041 Paper: self.20260418114040.360	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418114040.360 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 11:41	Success	-	View
exp_pytrain.20260418113759.088_20260418_113759 Paper: pytrain.20260418113759.088	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 11:39	Success	-	View
exp_self.20260418113151.359_20260418_113152 Paper: self.20260418113151.359	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418113151.359 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 11:32	Success	-	View
exp_self.20260418112407.358_20260418_112408 Paper: self.20260418112407.358	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418112407.358 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 11:25	Success	-	View
exp_self.20260418111624.357_20260418_111624 Paper: self.20260418111624.357	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418111624.357 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 11:17	Success	-	View
exp_self.20260418110836.356_20260418_110837 Paper: self.20260418110836.356	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418110836.356 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 11:09	Success	-	View
exp_pytrain.20260418110550.087_20260418_110550 Paper: pytrain.20260418110550.087	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 11:06	Success	-	View
exp_self.20260418110023.355_20260418_110023 Paper: self.20260418110023.355	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418110023.355 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 11:01	Success	-	View
exp_self.20260418105241.354_20260418_105241 Paper: self.20260418105241.354	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418105241.354 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 10:53	Success	-	View
exp_self.20260418104449.353_20260418_104449 Paper: self.20260418104449.353	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418104449.353 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 10:45	Success	-	View
exp_self.20260418103650.352_20260418_103650 Paper: self.20260418103650.352	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418103650.352 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 10:37	Success	-	View
exp_pytrain.20260418103415.086_20260418_103415 Paper: pytrain.20260418103415.086	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 10:35	Success	-	View
exp_self.20260418102656.351_20260418_102657 Paper: self.20260418102656.351	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418102656.351 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 10:28	Success	-	View
exp_self.20260418101926.350_20260418_101926 Paper: self.20260418101926.350	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418101926.350 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 10:20	Success	-	View
exp_self.20260418101151.349_20260418_101151 Paper: self.20260418101151.349	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418101151.349 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 10:12	Success	-	View
exp_self.20260418100420.348_20260418_100420 Paper: self.20260418100420.348	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418100420.348 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 10:05	Success	-	View
exp_pytrain.20260418100151.085_20260418_100151 Paper: pytrain.20260418100151.085	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 10:02	Success	-	View
exp_self.20260418095444.347_20260418_095444 Paper: self.20260418095444.347	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418095444.347 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 09:55	Success	-	View
exp_self.20260418094705.346_20260418_094705 Paper: self.20260418094705.346	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418094705.346 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 09:48	Success	-	View
exp_self.20260418093934.345_20260418_093935 Paper: self.20260418093934.345	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418093934.345 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 09:40	Success	-	View
exp_self.20260418093148.344_20260418_093148 Paper: self.20260418093148.344	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418093148.344 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 09:32	Success	-	View
exp_pytrain.20260418092909.084_20260418_092909 Paper: pytrain.20260418092909.084	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 09:30	Success	-	View
exp_self.20260418092445.343_20260418_092445 Paper: self.20260418092445.343	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418092445.343 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 09:25	Success	-	View
exp_self.20260418091718.342_20260418_091719 Paper: self.20260418091718.342	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418091718.342 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 09:18	Success	-	View
exp_self.20260418090940.341_20260418_090940 Paper: self.20260418090940.341	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418090940.341 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 09:10	Success	-	View
exp_self.20260418090201.340_20260418_090201 Paper: self.20260418090201.340	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418090201.340 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 09:03	Success	-	View
exp_gh_Sidgithub18_mlbuild_20260418_085912 Paper: gh_Sidgithub18_mlbuild	Sidgithub18/mlbuild Paper ID: gh_Sidgithub18_mlbuild - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered b...	04-18 09:00	Success	-	View
exp_pytrain.20260418085654.083_20260418_085654 Paper: pytrain.20260418085654.083	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 08:57	Success	-	View
exp_self.20260418085002.339_20260418_085003 Paper: self.20260418085002.339	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418085002.339 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 08:51	Success	-	View
exp_self.20260418084227.338_20260418_084227 Paper: self.20260418084227.338	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418084227.338 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 08:43	Success	-	View
exp_self.20260418083455.337_20260418_083455 Paper: self.20260418083455.337	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418083455.337 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 08:35	Success	-	View
exp_self.20260418082727.336_20260418_082727 Paper: self.20260418082727.336	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418082727.336 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 08:28	Success	-	View
exp_pytrain.20260418082454.082_20260418_082455 Paper: pytrain.20260418082454.082	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 08:25	Success	-	View
exp_self.20260418081800.335_20260418_081800 Paper: self.20260418081800.335	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418081800.335 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 08:19	Success	-	View
exp_self.20260418081036.334_20260418_081036 Paper: self.20260418081036.334	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418081036.334 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 08:11	Success	-	View
exp_self.20260418080254.333_20260418_080254 Paper: self.20260418080254.333	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418080254.333 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 08:03	Success	-	View
exp_self.20260418075509.332_20260418_075509 Paper: self.20260418075509.332	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418075509.332 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 07:56	Success	-	View
exp_pytrain.20260418075219.081_20260418_075220 Paper: pytrain.20260418075219.081	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 07:53	Success	-	View
exp_self.20260418074531.331_20260418_074531 Paper: self.20260418074531.331	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418074531.331 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 07:46	Success	-	View
exp_self.20260418073806.330_20260418_073806 Paper: self.20260418073806.330	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418073806.330 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 07:39	Success	-	View
exp_self.20260418073031.329_20260418_073031 Paper: self.20260418073031.329	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418073031.329 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 07:31	Success	-	View
exp_self.20260418072304.328_20260418_072305 Paper: self.20260418072304.328	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418072304.328 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 07:24	Success	-	View
exp_pytrain.20260418072041.080_20260418_072041 Paper: pytrain.20260418072041.080	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 07:21	Success	-	View
exp_self.20260418071328.327_20260418_071329 Paper: self.20260418071328.327	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418071328.327 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 07:14	Success	-	View
exp_self.20260418070557.326_20260418_070557 Paper: self.20260418070557.326	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418070557.326 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 07:07	Success	-	View
exp_gh_maple3788_RAG_Lab_20260418_070128 Paper: gh_maple3788_RAG_Lab	maple3788/RAG_Lab Paper ID: gh_maple3788_RAG_Lab - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered ben...	04-18 07:02	Success	-	View
exp_self.20260418065812.325_20260418_065812 Paper: self.20260418065812.325	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418065812.325 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 06:59	Success	-	View
exp_self.20260418065058.324_20260418_065059 Paper: self.20260418065058.324	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418065058.324 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 06:52	Success	-	View
exp_pytrain.20260418064817.079_20260418_064817 Paper: pytrain.20260418064817.079	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 06:49	Success	-	View
exp_self.20260418064138.323_20260418_064138 Paper: self.20260418064138.323	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418064138.323 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 06:42	Success	-	View
exp_self.20260418063422.322_20260418_063422 Paper: self.20260418063422.322	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418063422.322 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 06:35	Success	-	View
exp_self.20260418062706.321_20260418_062706 Paper: self.20260418062706.321	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418062706.321 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 06:28	Success	-	View
exp_self.20260418061954.320_20260418_061954 Paper: self.20260418061954.320	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418061954.320 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 06:20	Success	-	View
exp_pytrain.20260418061627.078_20260418_061628 Paper: pytrain.20260418061627.078	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 06:17	Success	-	View
exp_self.20260418061224.319_20260418_061224 Paper: self.20260418061224.319	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418061224.319 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 06:13	Success	-	View
exp_self.20260418060513.318_20260418_060513 Paper: self.20260418060513.318	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418060513.318 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 06:06	Success	-	View
exp_self.20260418055800.317_20260418_055800 Paper: self.20260418055800.317	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418055800.317 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 05:59	Success	-	View
exp_self.20260418055042.316_20260418_055043 Paper: self.20260418055042.316	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418055042.316 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 05:51	Success	-	View
exp_pytrain.20260418054506.077_20260418_054506 Paper: pytrain.20260418054506.077	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 05:46	Success	-	View
exp_self.20260418054318.315_20260418_054318 Paper: self.20260418054318.315	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418054318.315 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 05:44	Success	-	View
exp_self.20260418053600.314_20260418_053600 Paper: self.20260418053600.314	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418053600.314 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 05:37	Success	-	View
exp_self.20260418052844.313_20260418_052844 Paper: self.20260418052844.313	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418052844.313 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 05:29	Success	-	View
exp_self.20260418052132.312_20260418_052132 Paper: self.20260418052132.312	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418052132.312 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 05:22	Success	-	View
exp_gh_hussin2323332_slrm-lumin-fusion_20260418_051826 Paper: gh_hussin2323332_slrm-lumin-fusion	hussin2323332/slrm-lumin-fusion Paper ID: gh_hussin2323332_slrm-lumin-fusion - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal:...	04-18 05:19	Success	-	View
exp_self.20260418051413.311_20260418_051413 Paper: self.20260418051413.311	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418051413.311 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 05:15	Success	-	View
exp_pytrain.20260418051159.076_20260418_051159 Paper: pytrain.20260418051159.076	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 05:13	Success	-	View
exp_gh_mzuhair9933_PoPE-pytorch_20260418_050920 Paper: gh_mzuhair9933_PoPE-pytorch	mzuhair9933/PoPE-pytorch Paper ID: gh_mzuhair9933_PoPE-pytorch - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recove...	04-18 05:10	Success	-	View
exp_self.20260418050510.310_20260418_050511 Paper: self.20260418050510.310	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418050510.310 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 05:06	Success	-	View
exp_self.20260418045757.309_20260418_045757 Paper: self.20260418045757.309	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418045757.309 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 04:58	Success	-	View
exp_self.20260418045045.308_20260418_045046 Paper: self.20260418045045.308	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418045045.308 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 04:51	Success	-	View
exp_self.20260418044336.307_20260418_044336 Paper: self.20260418044336.307	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418044336.307 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 04:44	Success	-	View
exp_pytrain.20260418044010.075_20260418_044010 Paper: pytrain.20260418044010.075	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 04:41	Success	-	View
exp_self.20260418043607.306_20260418_043608 Paper: self.20260418043607.306	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418043607.306 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 04:37	Success	-	View
exp_self.20260418042856.305_20260418_042857 Paper: self.20260418042856.305	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418042856.305 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 04:29	Success	-	View
exp_self.20260418042142.304_20260418_042143 Paper: self.20260418042142.304	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418042142.304 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 04:22	Success	-	View
exp_self.20260418041419.303_20260418_041420 Paper: self.20260418041419.303	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418041419.303 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 04:15	Success	-	View
exp_pytrain.20260418040844.074_20260418_040844 Paper: pytrain.20260418040844.074	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 04:09	Success	-	View
exp_self.20260418040656.302_20260418_040656 Paper: self.20260418040656.302	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418040656.302 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 04:07	Success	-	View
exp_self.20260418035937.301_20260418_035937 Paper: self.20260418035937.301	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418035937.301 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 04:00	Success	-	View
exp_self.20260418035225.300_20260418_035225 Paper: self.20260418035225.300	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418035225.300 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 03:53	Success	-	View
exp_self.20260418034514.299_20260418_034514 Paper: self.20260418034514.299	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418034514.299 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 03:46	Success	-	View
exp_self.20260418033802.298_20260418_033802 Paper: self.20260418033802.298	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418033802.298 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 03:39	Success	-	View
exp_pytrain.20260418033540.073_20260418_033540 Paper: pytrain.20260418033540.073	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 03:36	Success	-	View
exp_oa_W7154587199_20260418_033300 Paper: oa_W7154587199	Mapping the LLM Landscape: A Cross-Family Survey of Architectures, Alignment Methods, and Benchmark Performance Paper ID: oa_W7154587199 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-18 03:34	Success	-	View
exp_self.20260418032743.297_20260418_032743 Paper: self.20260418032743.297	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418032743.297 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 03:28	Success	-	View
exp_self.20260418032024.296_20260418_032025 Paper: self.20260418032024.296	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418032024.296 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 03:21	Success	-	View
exp_self.20260418031311.295_20260418_031311 Paper: self.20260418031311.295	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418031311.295 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 03:14	Success	-	View
exp_self.20260418030601.294_20260418_030601 Paper: self.20260418030601.294	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418030601.294 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 03:07	Success	-	View
exp_pytrain.20260418030235.072_20260418_030235 Paper: pytrain.20260418030235.072	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 03:03	Success	-	View
exp_self.20260418025829.293_20260418_025829 Paper: self.20260418025829.293	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418025829.293 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 02:59	Success	-	View
exp_self.20260418025118.292_20260418_025118 Paper: self.20260418025118.292	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418025118.292 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 02:52	Success	-	View
exp_self.20260418024406.291_20260418_024406 Paper: self.20260418024406.291	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418024406.291 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 02:45	Success	-	View
exp_self.20260418023649.290_20260418_023649 Paper: self.20260418023649.290	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418023649.290 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 02:37	Success	-	View
exp_pytrain.20260418023112.071_20260418_023112 Paper: pytrain.20260418023112.071	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 02:32	Success	-	View
exp_self.20260418022924.289_20260418_022924 Paper: self.20260418022924.289	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418022924.289 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 02:30	Success	-	View
exp_self.20260418022207.288_20260418_022208 Paper: self.20260418022207.288	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418022207.288 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 02:23	Success	-	View
exp_self.20260418021453.287_20260418_021453 Paper: self.20260418021453.287	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418021453.287 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 02:15	Success	-	View
exp_self.20260418020741.286_20260418_020742 Paper: self.20260418020741.286	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418020741.286 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 02:08	Success	-	View
exp_self.20260418020027.285_20260418_020028 Paper: self.20260418020027.285	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418020027.285 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 02:01	Success	-	View
exp_pytrain.20260418015804.070_20260418_015805 Paper: pytrain.20260418015804.070	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 01:59	Success	-	View
exp_self.20260418015125.284_20260418_015125 Paper: self.20260418015125.284	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418015125.284 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 01:52	Success	-	View
exp_self.20260418014409.283_20260418_014410 Paper: self.20260418014409.283	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418014409.283 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 01:45	Success	-	View
exp_self.20260418013657.282_20260418_013657 Paper: self.20260418013657.282	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418013657.282 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 01:37	Success	-	View
exp_self.20260418012946.281_20260418_012946 Paper: self.20260418012946.281	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418012946.281 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 01:30	Success	-	View
exp_pytrain.20260418012619.069_20260418_012620 Paper: pytrain.20260418012619.069	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 01:27	Success	-	View
exp_self.20260418012216.280_20260418_012216 Paper: self.20260418012216.280	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418012216.280 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 01:23	Success	-	View
exp_self.20260418011505.279_20260418_011506 Paper: self.20260418011505.279	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418011505.279 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 01:16	Success	-	View
exp_self.20260418010753.278_20260418_010753 Paper: self.20260418010753.278	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418010753.278 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 01:08	Success	-	View
exp_self.20260418010038.277_20260418_010039 Paper: self.20260418010038.277	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418010038.277 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 01:01	Success	-	View
exp_gh_n24q02m_qwen3-embed_20260418_005755 Paper: gh_n24q02m_qwen3-embed	n24q02m/qwen3-embed Paper ID: gh_n24q02m_qwen3-embed - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered b...	04-18 00:58	Success	-	View
exp_pytrain.20260418005454.068_20260418_005454 Paper: pytrain.20260418005454.068	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 00:55	Success	-	View
exp_self.20260418005052.276_20260418_005053 Paper: self.20260418005052.276	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418005052.276 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 00:51	Success	-	View
exp_self.20260418004338.275_20260418_004338 Paper: self.20260418004338.275	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418004338.275 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 00:44	Success	-	View
exp_self.20260418003624.274_20260418_003624 Paper: self.20260418003624.274	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418003624.274 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 00:37	Success	-	View
exp_self.20260418002908.273_20260418_002909 Paper: self.20260418002908.273	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418002908.273 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 00:30	Success	-	View
exp_pytrain.20260418002335.067_20260418_002335 Paper: pytrain.20260418002335.067	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-18 00:24	Success	-	View
exp_self.20260418002146.272_20260418_002147 Paper: self.20260418002146.272	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418002146.272 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 00:22	Success	-	View
exp_self.20260418001428.271_20260418_001428 Paper: self.20260418001428.271	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418001428.271 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 00:15	Success	-	View
exp_self.20260418000712.270_20260418_000712 Paper: self.20260418000712.270	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418000712.270 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 00:08	Success	-	View
exp_self.20260418000001.269_20260418_000002 Paper: self.20260418000001.269	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260418000001.269 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-18 00:01	Success	-	View
exp_self.20260417235252.268_20260417_235252 Paper: self.20260417235252.268	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417235252.268 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 23:53	Success	-	View
exp_pytrain.20260417235030.066_20260417_235031 Paper: pytrain.20260417235030.066	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 23:51	Success	-	View
exp_self.20260417234521.267_20260417_234521 Paper: self.20260417234521.267	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417234521.267 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 23:46	Success	-	View
exp_gh_reissuerenewal84_moe-compress_20260417_234001 Paper: gh_reissuerenewal84_moe-compress	reissuerenewal84/moe-compress Paper ID: gh_reissuerenewal84_moe-compress - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: R...	04-17 23:41	Success	-	View
exp_self.20260417233802.266_20260417_233803 Paper: self.20260417233802.266	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417233802.266 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 23:39	Success	-	View
exp_self.20260417233051.265_20260417_233051 Paper: self.20260417233051.265	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417233051.265 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 23:31	Success	-	View
exp_gh_lakshgk_distill_20260417_232810 Paper: gh_lakshgk_distill	lakshgk/distill Paper ID: gh_lakshgk_distill - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered bench...	04-17 23:29	Success	-	View
exp_self.20260417232116.264_20260417_232116 Paper: self.20260417232116.264	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417232116.264 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 23:22	Success	-	View
exp_pytrain.20260417231858.065_20260417_231858 Paper: pytrain.20260417231858.065	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 23:20	Success	-	View
exp_self.20260417231134.263_20260417_231134 Paper: self.20260417231134.263	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417231134.263 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 23:12	Success	-	View
exp_self.20260417230409.262_20260417_230409 Paper: self.20260417230409.262	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417230409.262 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 23:05	Success	-	View
exp_self.20260417225644.261_20260417_225645 Paper: self.20260417225644.261	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417225644.261 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 22:57	Success	-	View
exp_self.20260417224900.260_20260417_224901 Paper: self.20260417224900.260	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417224900.260 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 22:50	Success	-	View
exp_pytrain.20260417224631.064_20260417_224631 Paper: pytrain.20260417224631.064	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 22:47	Success	-	View
exp_self.20260417223940.259_20260417_223941 Paper: self.20260417223940.259	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417223940.259 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 22:40	Success	-	View
exp_self.20260417223219.258_20260417_223219 Paper: self.20260417223219.258	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417223219.258 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 22:33	Success	-	View
exp_self.20260417222456.257_20260417_222457 Paper: self.20260417222456.257	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417222456.257 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 22:25	Success	-	View
exp_self.20260417221734.256_20260417_221735 Paper: self.20260417221734.256	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417221734.256 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 22:18	Success	-	View
exp_pytrain.20260417221504.063_20260417_221504 Paper: pytrain.20260417221504.063	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 22:16	Success	-	View
exp_self.20260417220817.255_20260417_220817 Paper: self.20260417220817.255	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417220817.255 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 22:09	Success	-	View
exp_self.20260417220049.254_20260417_220049 Paper: self.20260417220049.254	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417220049.254 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 22:01	Success	-	View
exp_self.20260417215324.253_20260417_215325 Paper: self.20260417215324.253	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417215324.253 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 21:54	Success	-	View
exp_self.20260417214601.252_20260417_214601 Paper: self.20260417214601.252	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417214601.252 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 21:47	Success	-	View
exp_pytrain.20260417214327.062_20260417_214327 Paper: pytrain.20260417214327.062	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 21:44	Success	-	View
exp_self.20260417213636.251_20260417_213636 Paper: self.20260417213636.251	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417213636.251 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 21:37	Success	-	View
exp_self.20260417212909.250_20260417_212909 Paper: self.20260417212909.250	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417212909.250 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 21:30	Success	-	View
exp_gh_sanjeev-ragunathan_evolution-of-ml_20260417_212341 Paper: gh_sanjeev-ragunathan_evolution-of-ml	sanjeev-ragunathan/evolution-of-ml Paper ID: gh_sanjeev-ragunathan_evolution-of-ml - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Sign...	04-17 21:24	Success	-	View
exp_self.20260417212128.249_20260417_212129 Paper: self.20260417212128.249	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417212128.249 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 21:22	Success	-	View
exp_self.20260417211351.248_20260417_211351 Paper: self.20260417211351.248	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417211351.248 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 21:14	Success	-	View
exp_pytrain.20260417211115.061_20260417_211116 Paper: pytrain.20260417211115.061	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 21:12	Success	-	View
exp_self.20260417210408.247_20260417_210408 Paper: self.20260417210408.247	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417210408.247 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 21:05	Success	-	View
exp_self.20260417205647.246_20260417_205648 Paper: self.20260417205647.246	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417205647.246 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 20:57	Success	-	View
exp_self.20260417204926.245_20260417_204926 Paper: self.20260417204926.245	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417204926.245 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 20:50	Success	-	View
exp_self.20260417204202.244_20260417_204202 Paper: self.20260417204202.244	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417204202.244 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 20:43	Success	-	View
exp_pytrain.20260417203937.060_20260417_203937 Paper: pytrain.20260417203937.060	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 20:40	Success	-	View
exp_self.20260417203309.243_20260417_203309 Paper: self.20260417203309.243	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417203309.243 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 20:34	Success	-	View
exp_self.20260417202607.242_20260417_202607 Paper: self.20260417202607.242	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417202607.242 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 20:27	Success	-	View
exp_gh_mtmatheuus_QKV-Core_20260417_202256 Paper: gh_mtmatheuus_QKV-Core	mtmatheuus/QKV-Core Paper ID: gh_mtmatheuus_QKV-Core - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered b...	04-17 20:23	Success	-	View
exp_self.20260417201645.241_20260417_201645 Paper: self.20260417201645.241	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417201645.241 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 20:17	Success	-	View
exp_self.20260417200923.240_20260417_200923 Paper: self.20260417200923.240	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417200923.240 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 20:10	Success	-	View
exp_pytrain.20260417200654.059_20260417_200655 Paper: pytrain.20260417200654.059	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 20:07	Success	-	View
exp_self.20260417200134.239_20260417_200135 Paper: self.20260417200134.239	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417200134.239 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 20:02	Success	-	View
exp_self.20260417195414.238_20260417_195414 Paper: self.20260417195414.238	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417195414.238 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 19:55	Success	-	View
exp_self.20260417194651.237_20260417_194651 Paper: self.20260417194651.237	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417194651.237 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 19:47	Success	-	View
exp_self.20260417193927.236_20260417_193927 Paper: self.20260417193927.236	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417193927.236 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 19:40	Success	-	View
exp_pytrain.20260417193445.058_20260417_193446 Paper: pytrain.20260417193445.058	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 19:35	Success	-	View
exp_self.20260417193248.235_20260417_193248 Paper: self.20260417193248.235	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417193248.235 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 19:33	Success	-	View
exp_self.20260417192526.234_20260417_192527 Paper: self.20260417192526.234	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417192526.234 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 19:26	Success	-	View
exp_self.20260417191807.233_20260417_191808 Paper: self.20260417191807.233	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417191807.233 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 19:19	Success	-	View
exp_self.20260417191045.232_20260417_191046 Paper: self.20260417191045.232	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417191045.232 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 19:11	Success	-	View
exp_self.20260417190328.231_20260417_190328 Paper: self.20260417190328.231	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417190328.231 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 19:04	Success	-	View
exp_pytrain.20260417190109.057_20260417_190109 Paper: pytrain.20260417190109.057	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 19:02	Success	-	View
exp_self.20260417185422.230_20260417_185423 Paper: self.20260417185422.230	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417185422.230 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 18:55	Success	-	View
exp_self.20260417184710.229_20260417_184710 Paper: self.20260417184710.229	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417184710.229 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 18:48	Success	-	View
exp_self.20260417183950.228_20260417_183951 Paper: self.20260417183950.228	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417183950.228 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 18:40	Success	-	View
exp_self.20260417183231.227_20260417_183232 Paper: self.20260417183231.227	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417183231.227 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 18:33	Success	-	View
exp_pytrain.20260417182906.056_20260417_182907 Paper: pytrain.20260417182906.056	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 18:30	Success	-	View
exp_self.20260417182503.226_20260417_182504 Paper: self.20260417182503.226	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417182503.226 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 18:26	Success	-	View
exp_self.20260417181751.225_20260417_181752 Paper: self.20260417181751.225	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417181751.225 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 18:18	Success	-	View
exp_self.20260417181039.224_20260417_181039 Paper: self.20260417181039.224	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417181039.224 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 18:11	Success	-	View
exp_self.20260417180329.223_20260417_180329 Paper: self.20260417180329.223	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417180329.223 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 18:04	Success	-	View
exp_pytrain.20260417175713.055_20260417_175713 Paper: pytrain.20260417175713.055	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 17:58	Success	-	View
exp_self.20260417175524.222_20260417_175525 Paper: self.20260417175524.222	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417175524.222 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 17:56	Success	-	View
exp_self.20260417174812.221_20260417_174813 Paper: self.20260417174812.221	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417174812.221 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 17:49	Success	-	View
exp_self.20260417174103.220_20260417_174103 Paper: self.20260417174103.220	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417174103.220 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 17:42	Success	-	View
exp_self.20260417173347.219_20260417_173348 Paper: self.20260417173347.219	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417173347.219 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 17:34	Success	-	View
exp_self.20260417172625.218_20260417_172626 Paper: self.20260417172625.218	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417172625.218 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 17:27	Success	-	View
exp_pytrain.20260417172338.054_20260417_172339 Paper: pytrain.20260417172338.054	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 17:24	Success	-	View
exp_self.20260417171927.217_20260417_171928 Paper: self.20260417171927.217	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417171927.217 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 17:20	Success	-	View
exp_self.20260417171216.216_20260417_171217 Paper: self.20260417171216.216	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417171216.216 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 17:13	Success	-	View
exp_self.20260417170507.215_20260417_170508 Paper: self.20260417170507.215	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417170507.215 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 17:06	Success	-	View
exp_self.20260417165759.214_20260417_165800 Paper: self.20260417165759.214	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417165759.214 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 16:59	Success	-	View
exp_pytrain.20260417165220.053_20260417_165220 Paper: pytrain.20260417165220.053	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 16:53	Success	-	View
exp_self.20260417165031.213_20260417_165031 Paper: self.20260417165031.213	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417165031.213 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 16:51	Success	-	View
exp_self.20260417164322.212_20260417_164322 Paper: self.20260417164322.212	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417164322.212 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 16:44	Success	-	View
exp_self.20260417163603.211_20260417_163603 Paper: self.20260417163603.211	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417163603.211 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 16:37	Success	-	View
exp_self.20260417162850.210_20260417_162850 Paper: self.20260417162850.210	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417162850.210 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 16:29	Success	-	View
exp_self.20260417162142.209_20260417_162143 Paper: self.20260417162142.209	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417162142.209 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 16:22	Success	-	View
exp_pytrain.20260417161928.052_20260417_161929 Paper: pytrain.20260417161928.052	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 16:20	Success	-	View
exp_self.20260417161413.208_20260417_161413 Paper: self.20260417161413.208	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417161413.208 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 16:15	Success	-	View
exp_self.20260417160603.207_20260417_160603 Paper: self.20260417160603.207	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417160603.207 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 16:07	Success	-	View
exp_self.20260417155849.206_20260417_155849 Paper: self.20260417155849.206	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417155849.206 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 15:59	Success	-	View
exp_self.20260417155139.205_20260417_155139 Paper: self.20260417155139.205	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417155139.205 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 15:52	Success	-	View
exp_pytrain.20260417154813.051_20260417_154813 Paper: pytrain.20260417154813.051	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 15:49	Success	-	View
exp_self.20260417154411.204_20260417_154412 Paper: self.20260417154411.204	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417154411.204 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 15:45	Success	-	View
exp_self.20260417153659.203_20260417_153700 Paper: self.20260417153659.203	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417153659.203 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 15:38	Success	-	View
exp_self.20260417152950.202_20260417_152950 Paper: self.20260417152950.202	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417152950.202 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 15:30	Success	-	View
exp_self.20260417152238.201_20260417_152238 Paper: self.20260417152238.201	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417152238.201 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 15:23	Success	-	View
exp_pytrain.20260417151658.050_20260417_151659 Paper: pytrain.20260417151658.050	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 15:18	Success	-	View
exp_self.20260417151511.200_20260417_151511 Paper: self.20260417151511.200	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417151511.200 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 15:16	Success	-	View
exp_self.20260417150800.199_20260417_150801 Paper: self.20260417150800.199	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417150800.199 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 15:09	Success	-	View
exp_self.20260417150042.198_20260417_150042 Paper: self.20260417150042.198	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417150042.198 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 15:01	Success	-	View
exp_self.20260417145326.197_20260417_145327 Paper: self.20260417145326.197	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417145326.197 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 14:54	Success	-	View
exp_self.20260417144614.196_20260417_144614 Paper: self.20260417144614.196	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417144614.196 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 14:47	Success	-	View
exp_pytrain.20260417144359.049_20260417_144400 Paper: pytrain.20260417144359.049	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 14:45	Success	-	View
exp_self.20260417143708.195_20260417_143708 Paper: self.20260417143708.195	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417143708.195 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 14:38	Success	-	View
exp_self.20260417142947.194_20260417_142947 Paper: self.20260417142947.194	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417142947.194 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 14:30	Success	-	View
exp_self.20260417142230.193_20260417_142230 Paper: self.20260417142230.193	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417142230.193 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 14:23	Success	-	View
exp_self.20260417141504.192_20260417_141504 Paper: self.20260417141504.192	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417141504.192 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 14:16	Success	-	View
exp_pytrain.20260417141242.048_20260417_141242 Paper: pytrain.20260417141242.048	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 14:13	Success	-	View
exp_self.20260417140727.191_20260417_140727 Paper: self.20260417140727.191	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417140727.191 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 14:08	Success	-	View
exp_self.20260417135954.190_20260417_135954 Paper: self.20260417135954.190	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417135954.190 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 14:00	Success	-	View
exp_self.20260417135219.189_20260417_135219 Paper: self.20260417135219.189	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417135219.189 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 13:53	Success	-	View
exp_self.20260417134451.188_20260417_134451 Paper: self.20260417134451.188	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417134451.188 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 13:45	Success	-	View
exp_pytrain.20260417134121.047_20260417_134122 Paper: pytrain.20260417134121.047	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 13:42	Success	-	View
exp_self.20260417133718.187_20260417_133719 Paper: self.20260417133718.187	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417133718.187 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 13:38	Success	-	View
exp_self.20260417133007.186_20260417_133007 Paper: self.20260417133007.186	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417133007.186 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 13:31	Success	-	View
exp_self.20260417132254.185_20260417_132255 Paper: self.20260417132254.185	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417132254.185 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 13:23	Success	-	View
exp_self.20260417131544.184_20260417_131544 Paper: self.20260417131544.184	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417131544.184 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 13:16	Success	-	View
exp_pytrain.20260417131002.046_20260417_131003 Paper: pytrain.20260417131002.046	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 13:11	Success	-	View
exp_self.20260417130813.183_20260417_130813 Paper: self.20260417130813.183	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417130813.183 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 13:09	Success	-	View
exp_self.20260417130057.182_20260417_130057 Paper: self.20260417130057.182	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417130057.182 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 13:01	Success	-	View
exp_self.20260417125338.181_20260417_125339 Paper: self.20260417125338.181	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417125338.181 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 12:54	Success	-	View
exp_self.20260417124621.180_20260417_124621 Paper: self.20260417124621.180	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417124621.180 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 12:47	Success	-	View
exp_self.20260417123909.179_20260417_123910 Paper: self.20260417123909.179	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417123909.179 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 12:40	Success	-	View
exp_pytrain.20260417123656.045_20260417_123656 Paper: pytrain.20260417123656.045	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 12:37	Success	-	View
exp_self.20260417123000.178_20260417_123001 Paper: self.20260417123000.178	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417123000.178 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 12:31	Success	-	View
exp_self.20260417122220.177_20260417_122220 Paper: self.20260417122220.177	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417122220.177 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 12:23	Success	-	View
exp_self.20260417121509.176_20260417_121509 Paper: self.20260417121509.176	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417121509.176 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 12:16	Success	-	View
exp_self.20260417120757.175_20260417_120758 Paper: self.20260417120757.175	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417120757.175 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 12:09	Success	-	View
exp_pytrain.20260417120537.044_20260417_120538 Paper: pytrain.20260417120537.044	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 12:06	Success	-	View
exp_self.20260417115857.174_20260417_115858 Paper: self.20260417115857.174	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417115857.174 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 12:00	Success	-	View
exp_self.20260417115139.173_20260417_115139 Paper: self.20260417115139.173	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417115139.173 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 11:52	Success	-	View
exp_self.20260417114425.172_20260417_114426 Paper: self.20260417114425.172	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417114425.172 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 11:45	Success	-	View
exp_self.20260417113712.171_20260417_113712 Paper: self.20260417113712.171	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417113712.171 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 11:38	Success	-	View
exp_pytrain.20260417113347.043_20260417_113347 Paper: pytrain.20260417113347.043	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 11:34	Success	-	View
exp_self.20260417112944.170_20260417_112944 Paper: self.20260417112944.170	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417112944.170 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 11:30	Success	-	View
exp_self.20260417112229.169_20260417_112230 Paper: self.20260417112229.169	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417112229.169 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 11:23	Success	-	View
exp_self.20260417111517.168_20260417_111518 Paper: self.20260417111517.168	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417111517.168 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 11:16	Success	-	View
exp_self.20260417110801.167_20260417_110801 Paper: self.20260417110801.167	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417110801.167 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 11:09	Success	-	View
exp_pytrain.20260417110227.042_20260417_110228 Paper: pytrain.20260417110227.042	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 11:03	Success	-	View
exp_self.20260417110040.166_20260417_110040 Paper: self.20260417110040.166	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417110040.166 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 11:01	Success	-	View
exp_self.20260417105319.165_20260417_105319 Paper: self.20260417105319.165	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417105319.165 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 10:54	Success	-	View
exp_self.20260417104605.164_20260417_104605 Paper: self.20260417104605.164	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417104605.164 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 10:47	Success	-	View
exp_self.20260417103853.163_20260417_103853 Paper: self.20260417103853.163	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417103853.163 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 10:39	Success	-	View
exp_self.20260417103137.162_20260417_103137 Paper: self.20260417103137.162	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417103137.162 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 10:32	Success	-	View
exp_pytrain.20260417102916.041_20260417_102917 Paper: pytrain.20260417102916.041	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 10:30	Success	-	View
exp_self.20260417102407.161_20260417_102407 Paper: self.20260417102407.161	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417102407.161 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 10:25	Success	-	View
exp_self.20260417101654.160_20260417_101655 Paper: self.20260417101654.160	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417101654.160 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 10:17	Success	-	View
exp_self.20260417100927.159_20260417_100928 Paper: self.20260417100927.159	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417100927.159 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 10:10	Success	-	View
exp_self.20260417100203.158_20260417_100204 Paper: self.20260417100203.158	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417100203.158 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 10:03	Success	-	View
exp_pytrain.20260417095726.040_20260417_095726 Paper: pytrain.20260417095726.040	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 09:58	Success	-	View
exp_self.20260417095424.157_20260417_095425 Paper: self.20260417095424.157	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417095424.157 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 09:55	Success	-	View
exp_self.20260417094657.156_20260417_094657 Paper: self.20260417094657.156	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417094657.156 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 09:47	Success	-	View
exp_self.20260417093932.155_20260417_093936 Paper: self.20260417093932.155	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417093932.155 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 09:40	Success	-	View
exp_self.20260417093155.154_20260417_093155 Paper: self.20260417093155.154	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417093155.154 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 09:32	Success	-	View
exp_pytrain.20260417092540.039_20260417_092540 Paper: pytrain.20260417092540.039	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 09:26	Success	-	View
exp_self.20260417092351.153_20260417_092351 Paper: self.20260417092351.153	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417092351.153 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 09:24	Success	-	View
exp_self.20260417091639.152_20260417_091640 Paper: self.20260417091639.152	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417091639.152 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 09:17	Success	-	View
exp_self.20260417090926.151_20260417_090926 Paper: self.20260417090926.151	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417090926.151 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 09:10	Success	-	View
exp_self.20260417090207.150_20260417_090208 Paper: self.20260417090207.150	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417090207.150 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 09:03	Success	-	View
exp_self.20260417085450.149_20260417_085451 Paper: self.20260417085450.149	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417085450.149 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 08:55	Success	-	View
exp_pytrain.20260417085236.038_20260417_085236 Paper: pytrain.20260417085236.038	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 08:53	Success	-	View
exp_self.20260417084720.148_20260417_084720 Paper: self.20260417084720.148	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417084720.148 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 08:48	Success	-	View
exp_hf_2211.16780_20260417_084412 Paper: hf_2211.16780	An Optimal Transport-driven Approach for Cultivating Latent Space in Online Incremental Learning Paper ID: hf_2211.16780 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-17 08:45	Success	-	View
exp_self.20260417083851.147_20260417_083852 Paper: self.20260417083851.147	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417083851.147 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 08:39	Success	-	View
exp_self.20260417083105.146_20260417_083106 Paper: self.20260417083105.146	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417083105.146 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 08:32	Success	-	View
exp_self.20260417082324.145_20260417_082324 Paper: self.20260417082324.145	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417082324.145 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 08:24	Success	-	View
exp_pytrain.20260417082043.037_20260417_082043 Paper: pytrain.20260417082043.037	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 08:21	Success	-	View
exp_self.20260417081448.144_20260417_081448 Paper: self.20260417081448.144	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417081448.144 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 08:15	Success	-	View
exp_self.20260417080715.143_20260417_080716 Paper: self.20260417080715.143	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417080715.143 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 08:08	Success	-	View
exp_self.20260417075935.142_20260417_075935 Paper: self.20260417075935.142	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417075935.142 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 08:00	Success	-	View
exp_self.20260417075156.141_20260417_075157 Paper: self.20260417075156.141	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417075156.141 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 07:52	Success	-	View
exp_pytrain.20260417074917.036_20260417_074917 Paper: pytrain.20260417074917.036	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 07:50	Success	-	View
exp_self.20260417074216.140_20260417_074216 Paper: self.20260417074216.140	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417074216.140 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 07:43	Success	-	View
exp_self.20260417073447.139_20260417_073448 Paper: self.20260417073447.139	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417073447.139 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 07:35	Success	-	View
exp_self.20260417072717.138_20260417_072717 Paper: self.20260417072717.138	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417072717.138 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 07:28	Success	-	View
exp_self.20260417071940.137_20260417_071941 Paper: self.20260417071940.137	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417071940.137 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 07:20	Success	-	View
exp_pytrain.20260417071709.035_20260417_071709 Paper: pytrain.20260417071709.035	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 07:18	Success	-	View
exp_self.20260417071004.136_20260417_071005 Paper: self.20260417071004.136	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417071004.136 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 07:11	Success	-	View
exp_self.20260417070224.135_20260417_070224 Paper: self.20260417070224.135	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417070224.135 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 07:03	Success	-	View
exp_self.20260417065445.134_20260417_065445 Paper: self.20260417065445.134	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417065445.134 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 06:55	Success	-	View
exp_self.20260417064714.133_20260417_064714 Paper: self.20260417064714.133	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417064714.133 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 06:48	Success	-	View
exp_pytrain.20260417064437.034_20260417_064437 Paper: pytrain.20260417064437.034	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 06:45	Success	-	View
exp_self.20260417063744.132_20260417_063744 Paper: self.20260417063744.132	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417063744.132 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 06:38	Success	-	View
exp_cr_10.3390_app16083892_20260417_063317 Paper: cr_10.3390_app16083892	Latent Diffusion Model for Chlorophyll Remote Sensing Spectral Synthesis Integrating Bio-Optical Priors and Band Attenti... Paper ID: cr_10.3390_app16083892 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered b...	04-17 06:34	Success	-	View
exp_self.20260417063002.131_20260417_063002 Paper: self.20260417063002.131	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417063002.131 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 06:31	Success	-	View
exp_self.20260417062224.130_20260417_062224 Paper: self.20260417062224.130	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417062224.130 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 06:23	Success	-	View
exp_cr_10.1145_3807782_20260417_061749 Paper: cr_10.1145_3807782	Efficient Addition-Based Sparse GEMM for Fast Ternary Large Language Model Inference on Edge Devices Paper ID: cr_10.1145_3807782 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered bench...	04-17 06:18	Success	-	View
exp_self.20260417061523.129_20260417_061524 Paper: self.20260417061523.129	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417061523.129 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 06:16	Success	-	View
exp_pytrain.20260417061254.033_20260417_061254 Paper: pytrain.20260417061254.033	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 06:13	Success	-	View
exp_self.20260417060549.128_20260417_060549 Paper: self.20260417060549.128	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417060549.128 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 06:06	Success	-	View
exp_self.20260417055815.127_20260417_055815 Paper: self.20260417055815.127	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417055815.127 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 05:59	Success	-	View
exp_self.20260417055036.126_20260417_055036 Paper: self.20260417055036.126	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417055036.126 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 05:51	Success	-	View
exp_self.20260417054307.125_20260417_054307 Paper: self.20260417054307.125	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417054307.125 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 05:44	Success	-	View
exp_pytrain.20260417054037.032_20260417_054037 Paper: pytrain.20260417054037.032	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 05:41	Success	-	View
exp_self.20260417053502.124_20260417_053503 Paper: self.20260417053502.124	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417053502.124 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 05:36	Success	-	View
exp_self.20260417052728.123_20260417_052728 Paper: self.20260417052728.123	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417052728.123 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 05:28	Success	-	View
exp_self.20260417051943.122_20260417_051944 Paper: self.20260417051943.122	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417051943.122 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 05:20	Success	-	View
exp_self.20260417051157.121_20260417_051158 Paper: self.20260417051157.121	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417051157.121 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 05:13	Success	-	View
exp_pytrain.20260417050917.031_20260417_050918 Paper: pytrain.20260417050917.031	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 05:10	Success	-	View
exp_self.20260417050347.120_20260417_050348 Paper: self.20260417050347.120	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417050347.120 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 05:04	Success	-	View
exp_self.20260417045553.119_20260417_045554 Paper: self.20260417045553.119	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417045553.119 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 04:56	Success	-	View
exp_self.20260417044816.118_20260417_044817 Paper: self.20260417044816.118	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417044816.118 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 04:49	Success	-	View
exp_self.20260417044043.117_20260417_044043 Paper: self.20260417044043.117	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417044043.117 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 04:41	Success	-	View
exp_pytrain.20260417043753.030_20260417_043753 Paper: pytrain.20260417043753.030	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 04:38	Success	-	View
exp_hf_2604.14572_20260417_043506 Paper: hf_2604.14572	Don't Retrieve, Navigate: Distilling Enterprise Knowledge into Navigable Agent Skills for QA and RAG Paper ID: hf_2604.14572 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-17 04:36	Success	-	View
exp_self.20260417043037.116_20260417_043037 Paper: self.20260417043037.116	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417043037.116 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 04:31	Success	-	View
exp_self.20260417042240.115_20260417_042240 Paper: self.20260417042240.115	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417042240.115 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 04:23	Success	-	View
exp_self.20260417041503.114_20260417_041504 Paper: self.20260417041503.114	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417041503.114 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 04:16	Success	-	View
exp_self.20260417040721.113_20260417_040721 Paper: self.20260417040721.113	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417040721.113 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 04:08	Success	-	View
exp_pytrain.20260417040450.029_20260417_040451 Paper: pytrain.20260417040450.029	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 04:05	Success	-	View
exp_self.20260417035743.112_20260417_035743 Paper: self.20260417035743.112	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417035743.112 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 03:58	Success	-	View
exp_self.20260417035011.111_20260417_035012 Paper: self.20260417035011.111	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417035011.111 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 03:51	Success	-	View
exp_self.20260417034243.110_20260417_034244 Paper: self.20260417034243.110	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417034243.110 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 03:43	Success	-	View
exp_self.20260417033510.109_20260417_033511 Paper: self.20260417033510.109	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417033510.109 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 03:36	Success	-	View
exp_pytrain.20260417033241.028_20260417_033241 Paper: pytrain.20260417033241.028	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 03:33	Success	-	View
exp_self.20260417032542.108_20260417_032542 Paper: self.20260417032542.108	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417032542.108 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 03:26	Success	-	View
exp_self.20260417031813.107_20260417_031815 Paper: self.20260417031813.107	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417031813.107 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 03:19	Success	-	View
exp_self.20260417031041.106_20260417_031041 Paper: self.20260417031041.106	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417031041.106 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 03:11	Success	-	View
exp_hf_2604.14629_20260417_030721 Paper: hf_2604.14629	Switch-KD: Visual-Switch Knowledge Distillation for Vision-Language Models Paper ID: hf_2604.14629 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-17 03:08	Success	-	View
exp_self.20260417030241.105_20260417_030243 Paper: self.20260417030241.105	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417030241.105 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 03:03	Success	-	View
exp_pytrain.20260417030001.027_20260417_030001 Paper: pytrain.20260417030001.027	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 03:01	Success	-	View
exp_self.20260417025532.104_20260417_025532 Paper: self.20260417025532.104	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417025532.104 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 02:56	Success	-	View
exp_self.20260417024752.103_20260417_024752 Paper: self.20260417024752.103	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417024752.103 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 02:48	Success	-	View
exp_self.20260417024016.102_20260417_024016 Paper: self.20260417024016.102	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417024016.102 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 02:41	Success	-	View
exp_self.20260417023231.101_20260417_023231 Paper: self.20260417023231.101	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417023231.101 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 02:33	Success	-	View
exp_pytrain.20260417022831.026_20260417_022832 Paper: pytrain.20260417022831.026	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 02:29	Success	-	View
exp_self.20260417022511.100_20260417_022512 Paper: self.20260417022511.100	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417022511.100 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 02:26	Success	-	View
exp_self.20260417021730.099_20260417_021730 Paper: self.20260417021730.099	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417021730.099 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 02:18	Success	-	View
exp_self.20260417020944.098_20260417_020944 Paper: self.20260417020944.098	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417020944.098 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 02:10	Success	-	View
exp_self.20260417020214.097_20260417_020214 Paper: self.20260417020214.097	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417020214.097 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 02:03	Success	-	View
exp_hf_2604.14531_20260417_015916 Paper: hf_2604.14531	TRACER: Trace-Based Adaptive Cost-Efficient Routing for LLM Classification Paper ID: hf_2604.14531 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-17 02:00	Success	-	View
exp_pytrain.20260417015711.025_20260417_015711 Paper: pytrain.20260417015711.025	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 01:58	Success	-	View
exp_self.20260417015247.096_20260417_015247 Paper: self.20260417015247.096	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417015247.096 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 01:53	Success	-	View
exp_self.20260417014516.095_20260417_014516 Paper: self.20260417014516.095	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417014516.095 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 01:46	Success	-	View
exp_self.20260417013737.094_20260417_013738 Paper: self.20260417013737.094	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417013737.094 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 01:38	Success	-	View
exp_self.20260417013009.093_20260417_013009 Paper: self.20260417013009.093	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417013009.093 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 01:31	Success	-	View
exp_pytrain.20260417012524.024_20260417_012524 Paper: pytrain.20260417012524.024	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 01:26	Success	-	View
exp_self.20260417012313.092_20260417_012313 Paper: self.20260417012313.092	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417012313.092 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 01:24	Success	-	View
exp_hf_2604.11661_20260417_011838 Paper: hf_2604.11661	Towards Autonomous Mechanistic Reasoning in Virtual Cells Paper ID: hf_2604.11661 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-17 01:19	Success	-	View
exp_self.20260417011414.091_20260417_011414 Paper: self.20260417011414.091	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417011414.091 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 01:15	Success	-	View
exp_self.20260417010446.090_20260417_010446 Paper: self.20260417010446.090	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417010446.090 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 01:05	Success	-	View
exp_self.20260417005649.089_20260417_005650 Paper: self.20260417005649.089	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417005649.089 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 00:57	Success	-	View
exp_gh_msu-denver_bili-core_20260417_005401 Paper: gh_msu-denver_bili-core	msu-denver/bili-core Paper ID: gh_msu-denver_bili-core - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 00:55	Success	-	View
exp_pytrain.20260417005151.023_20260417_005152 Paper: pytrain.20260417005151.023	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 00:52	Success	-	View
exp_self.20260417004633.088_20260417_004633 Paper: self.20260417004633.088	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417004633.088 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 00:47	Success	-	View
exp_self.20260417003905.087_20260417_003906 Paper: self.20260417003905.087	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417003905.087 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 00:40	Success	-	View
exp_self.20260417003102.086_20260417_003102 Paper: self.20260417003102.086	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417003102.086 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 00:32	Success	-	View
exp_self.20260417002221.085_20260417_002222 Paper: self.20260417002221.085	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417002221.085 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 00:23	Success	-	View
exp_pytrain.20260417001957.022_20260417_001959 Paper: pytrain.20260417001957.022	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-17 00:21	Success	-	View
exp_hf_2604.15284_20260417_001536 Paper: hf_2604.15284	GlobalSplat: Efficient Feed-Forward 3D Gaussian Splatting via Global Scene Tokens Paper ID: hf_2604.15284 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-17 00:16	Success	-	View
exp_self.20260417001225.084_20260417_001225 Paper: self.20260417001225.084	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417001225.084 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 00:13	Success	-	View
exp_self.20260417000459.083_20260417_000500 Paper: self.20260417000459.083	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260417000459.083 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-17 00:06	Success	-	View
exp_self.20260416235727.082_20260416_235728 Paper: self.20260416235727.082	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416235727.082 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 23:58	Success	-	View
exp_self.20260416235006.081_20260416_235007 Paper: self.20260416235006.081	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416235006.081 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 23:51	Success	-	View
exp_pytrain.20260416234734.021_20260416_234734 Paper: pytrain.20260416234734.021	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 23:48	Success	-	View
exp_self.20260416234042.080_20260416_234042 Paper: self.20260416234042.080	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416234042.080 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 23:41	Success	-	View
exp_self.20260416233309.079_20260416_233310 Paper: self.20260416233309.079	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416233309.079 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 23:34	Success	-	View
exp_self.20260416232538.078_20260416_232538 Paper: self.20260416232538.078	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416232538.078 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 23:26	Success	-	View
exp_self.20260416231809.077_20260416_231810 Paper: self.20260416231809.077	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416231809.077 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 23:19	Success	-	View
exp_pytrain.20260416231534.020_20260416_231534 Paper: pytrain.20260416231534.020	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 23:16	Success	-	View
exp_self.20260416230841.076_20260416_230841 Paper: self.20260416230841.076	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416230841.076 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 23:09	Success	-	View
exp_self.20260416230107.075_20260416_230108 Paper: self.20260416230107.075	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416230107.075 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 23:02	Success	-	View
exp_self.20260416225341.074_20260416_225342 Paper: self.20260416225341.074	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416225341.074 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 22:54	Success	-	View
exp_gh_sakhama_memfuse_20260416_224810 Paper: gh_sakhama_memfuse	sakhama/memfuse Paper ID: gh_sakhama_memfuse - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered bench...	04-16 22:49	Success	-	View
exp_self.20260416224558.073_20260416_224559 Paper: self.20260416224558.073	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416224558.073 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 22:47	Success	-	View
exp_pytrain.20260416224331.019_20260416_224331 Paper: pytrain.20260416224331.019	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 22:44	Success	-	View
exp_self.20260416223858.072_20260416_223858 Paper: self.20260416223858.072	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416223858.072 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 22:40	Success	-	View
exp_self.20260416223128.071_20260416_223128 Paper: self.20260416223128.071	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416223128.071 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 22:32	Success	-	View
exp_hf_2604.14125_20260416_222809 Paper: hf_2604.14125	HiVLA: A Visual-Grounded-Centric Hierarchical Embodied Manipulation System Paper ID: hf_2604.14125 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-16 22:29	Success	-	View
exp_self.20260416222347.070_20260416_222348 Paper: self.20260416222347.070	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416222347.070 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 22:24	Success	-	View
exp_gh_Daubingweirdie414_multimodal-rag_20260416_221819 Paper: gh_Daubingweirdie414_multimodal-rag	Daubingweirdie414/multimodal-rag Paper ID: gh_Daubingweirdie414_multimodal-rag - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal...	04-16 22:19	Success	-	View
exp_self.20260416221607.069_20260416_221607 Paper: self.20260416221607.069	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416221607.069 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 22:17	Success	-	View
exp_gh_Mustii2009_NeuroRag_20260416_221320 Paper: gh_Mustii2009_NeuroRag	Mustii2009/NeuroRag Paper ID: gh_Mustii2009_NeuroRag - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered b...	04-16 22:14	Success	-	View
exp_pytrain.20260416221112.018_20260416_221113 Paper: pytrain.20260416221112.018	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 22:12	Success	-	View
exp_hf_2509.25843_20260416_220826 Paper: hf_2509.25843	ASGuard: Activation-Scaling Guard to Mitigate Targeted Jailbreaking Attack Paper ID: hf_2509.25843 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-16 22:09	Success	-	View
exp_self.20260416220512.068_20260416_220513 Paper: self.20260416220512.068	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416220512.068 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 22:06	Success	-	View
exp_2604.15306v1_20260416_220223 Paper: 2604.15306v1	Generalization in LLM Problem Solving: The Case of the Shortest Path Paper ID: 2604.15306v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-16 22:03	Success	-	View
exp_self.20260416215529.067_20260416_215530 Paper: self.20260416215529.067	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416215529.067 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 21:56	Success	-	View
exp_2604.15308v1_20260416_215106 Paper: 2604.15308v1	RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework Paper ID: 2604.15308v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-16 21:52	Success	-	View
exp_self.20260416214755.066_20260416_214755 Paper: self.20260416214755.066	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416214755.066 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 21:48	Success	-	View
exp_self.20260416214035.065_20260416_214035 Paper: self.20260416214035.065	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416214035.065 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 21:41	Success	-	View
exp_pytrain.20260416213803.017_20260416_213804 Paper: pytrain.20260416213803.017	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 21:39	Success	-	View
exp_hf_2604.14922_20260416_213548 Paper: hf_2604.14922	LongAct: Harnessing Intrinsic Activation Patterns for Long-Context Reinforcement Learning Paper ID: hf_2604.14922 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-16 21:36	Success	-	View
exp_self.20260416213236.064_20260416_213236 Paper: self.20260416213236.064	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416213236.064 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 21:33	Success	-	View
exp_hf_2604.14967_20260416_212947 Paper: hf_2604.14967	UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards Paper ID: hf_2604.14967 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-16 21:30	Success	-	View
exp_hf_2604.15308_20260416_212546 Paper: hf_2604.15308	RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework Paper ID: hf_2604.15308 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-16 21:26	Success	-	View
exp_self.20260416212346.063_20260416_212347 Paper: self.20260416212346.063	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416212346.063 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 21:24	Success	-	View
exp_self.20260416211618.062_20260416_211618 Paper: self.20260416211618.062	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416211618.062 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 21:17	Success	-	View
exp_hf_2604.14683_20260416_211303 Paper: hf_2604.14683	DR^{3}-Eval: Towards Realistic and Reproducible Deep Research Evaluation Paper ID: hf_2604.14683 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-16 21:14	Success	-	View
exp_self.20260416210846.061_20260416_210847 Paper: self.20260416210846.061	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416210846.061 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 21:09	Success	-	View
exp_pytrain.20260416210615.016_20260416_210616 Paper: pytrain.20260416210615.016	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 21:07	Success	-	View
exp_hf_2604.13226_20260416_210332 Paper: hf_2604.13226	KV Packet: Recomputation-Free Context-Independent KV Caching for LLMs Paper ID: hf_2604.13226 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-16 21:04	Success	-	View
exp_self.20260416205914.060_20260416_205914 Paper: self.20260416205914.060	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416205914.060 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 21:00	Success	-	View
exp_2604.15167v1_20260416_205449 Paper: 2604.15167v1	When Flat Minima Fail: Characterizing INT4 Quantization Collapse After FP32 Convergence Paper ID: 2604.15167v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-16 20:55	Success	-	View
exp_self.20260416205136.059_20260416_205136 Paper: self.20260416205136.059	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416205136.059 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 20:52	Success	-	View
exp_2604.15174v1_20260416_204849 Paper: 2604.15174v1	MambaSL: Exploring Single-Layer Mamba for Time Series Classification Paper ID: 2604.15174v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-16 20:49	Success	-	View
exp_self.20260416204157.058_20260416_204157 Paper: self.20260416204157.058	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416204157.058 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 20:43	Success	-	View
exp_self.20260416203440.057_20260416_203441 Paper: self.20260416203440.057	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416203440.057 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 20:35	Success	-	View
exp_pytrain.20260416203212.015_20260416_203213 Paper: pytrain.20260416203212.015	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 20:33	Success	-	View
exp_self.20260416202511.056_20260416_202512 Paper: self.20260416202511.056	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416202511.056 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 20:26	Success	-	View
exp_self.20260416201745.055_20260416_201746 Paper: self.20260416201745.055	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416201745.055 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 20:18	Success	-	View
exp_self.20260416201012.054_20260416_201013 Paper: self.20260416201012.054	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416201012.054 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 20:11	Success	-	View
exp_self.20260416200237.053_20260416_200237 Paper: self.20260416200237.053	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416200237.053 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 20:03	Success	-	View
exp_pytrain.20260416200008.014_20260416_200009 Paper: pytrain.20260416200008.014	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 20:01	Success	-	View
exp_self.20260416195307.052_20260416_195307 Paper: self.20260416195307.052	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416195307.052 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 19:54	Success	-	View
exp_self.20260416194534.051_20260416_194534 Paper: self.20260416194534.051	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416194534.051 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 19:46	Success	-	View
exp_self.20260416193803.050_20260416_193803 Paper: self.20260416193803.050	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416193803.050 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 19:39	Success	-	View
exp_self.20260416193026.049_20260416_193026 Paper: self.20260416193026.049	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416193026.049 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 19:31	Success	-	View
exp_pytrain.20260416192757.013_20260416_192758 Paper: pytrain.20260416192757.013	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 19:29	Success	-	View
exp_self.20260416192051.048_20260416_192051 Paper: self.20260416192051.048	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416192051.048 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 19:21	Success	-	View
exp_self.20260416191323.047_20260416_191323 Paper: self.20260416191323.047	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416191323.047 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 19:14	Success	-	View
exp_gh_qualcomm_ai-hub-models_20260416_190751 Paper: gh_qualcomm_ai-hub-models	qualcomm/ai-hub-models Paper ID: gh_qualcomm_ai-hub-models - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovere...	04-16 19:08	Success	-	View
exp_self.20260416190543.046_20260416_190543 Paper: self.20260416190543.046	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416190543.046 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 19:06	Success	-	View
exp_self.20260416185810.045_20260416_185811 Paper: self.20260416185810.045	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416185810.045 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 18:59	Success	-	View
exp_pytrain.20260416185542.012_20260416_185542 Paper: pytrain.20260416185542.012	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 18:56	Success	-	View
exp_self.20260416184839.044_20260416_184839 Paper: self.20260416184839.044	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416184839.044 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 18:49	Success	-	View
exp_self.20260416184111.043_20260416_184111 Paper: self.20260416184111.043	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416184111.043 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 18:42	Success	-	View
exp_self.20260416183348.042_20260416_183348 Paper: self.20260416183348.042	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416183348.042 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 18:34	Success	-	View
exp_self.20260416182624.041_20260416_182624 Paper: self.20260416182624.041	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416182624.041 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 18:27	Success	-	View
exp_pytrain.20260416182358.011_20260416_182358 Paper: pytrain.20260416182358.011	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 18:25	Success	-	View
exp_self.20260416181706.040_20260416_181706 Paper: self.20260416181706.040	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416181706.040 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 18:18	Success	-	View
exp_self.20260416180940.039_20260416_180940 Paper: self.20260416180940.039	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416180940.039 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 18:10	Success	-	View
exp_self.20260416180203.038_20260416_180204 Paper: self.20260416180203.038	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416180203.038 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 18:03	Success	-	View
exp_self.20260416175437.037_20260416_175438 Paper: self.20260416175437.037	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416175437.037 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 17:55	Success	-	View
exp_pytrain.20260416175209.010_20260416_175209 Paper: pytrain.20260416175209.010	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 17:53	Success	-	View
exp_self.20260416174514.036_20260416_174514 Paper: self.20260416174514.036	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416174514.036 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 17:46	Success	-	View
exp_self.20260416173747.035_20260416_173747 Paper: self.20260416173747.035	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416173747.035 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 17:38	Success	-	View
exp_self.20260416173015.034_20260416_173016 Paper: self.20260416173015.034	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416173015.034 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 17:31	Success	-	View
exp_self.20260416172247.033_20260416_172248 Paper: self.20260416172247.033	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416172247.033 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 17:23	Success	-	View
exp_pytrain.20260416172018.009_20260416_172019 Paper: pytrain.20260416172018.009	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 17:21	Success	-	View
exp_self.20260416171323.032_20260416_171323 Paper: self.20260416171323.032	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416171323.032 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 17:14	Success	-	View
exp_self.20260416170559.031_20260416_170559 Paper: self.20260416170559.031	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416170559.031 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 17:07	Success	-	View
exp_self.20260416165831.030_20260416_165832 Paper: self.20260416165831.030	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416165831.030 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 16:59	Success	-	View
exp_self.20260416165101.029_20260416_165102 Paper: self.20260416165101.029	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416165101.029 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 16:52	Success	-	View
exp_pytrain.20260416164832.008_20260416_164832 Paper: pytrain.20260416164832.008	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 16:49	Success	-	View
exp_self.20260416164137.028_20260416_164138 Paper: self.20260416164137.028	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416164137.028 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 16:42	Success	-	View
exp_self.20260416163411.027_20260416_163411 Paper: self.20260416163411.027	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416163411.027 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 16:35	Success	-	View
exp_self.20260416162644.026_20260416_162644 Paper: self.20260416162644.026	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416162644.026 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 16:27	Success	-	View
exp_self.20260416161913.025_20260416_161913 Paper: self.20260416161913.025	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416161913.025 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 16:20	Success	-	View
exp_pytrain.20260416161643.007_20260416_161644 Paper: pytrain.20260416161643.007	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 16:17	Success	-	View
exp_self.20260416160948.024_20260416_160948 Paper: self.20260416160948.024	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416160948.024 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 16:10	Success	-	View
exp_self.20260416160222.023_20260416_160222 Paper: self.20260416160222.023	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416160222.023 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 16:03	Success	-	View
exp_self.20260416155454.022_20260416_155454 Paper: self.20260416155454.022	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416155454.022 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 15:55	Success	-	View
exp_self.20260416154723.021_20260416_154723 Paper: self.20260416154723.021	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416154723.021 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 15:48	Success	-	View
exp_pytrain.20260416154448.006_20260416_154449 Paper: pytrain.20260416154448.006	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 15:45	Success	-	View
exp_self.20260416153753.020_20260416_153754 Paper: self.20260416153753.020	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416153753.020 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 15:38	Success	-	View
exp_self.20260416153017.019_20260416_153018 Paper: self.20260416153017.019	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416153017.019 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 15:31	Success	-	View
exp_self.20260416152241.018_20260416_152241 Paper: self.20260416152241.018	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416152241.018 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 15:23	Success	-	View
exp_self.20260416151505.017_20260416_151506 Paper: self.20260416151505.017	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416151505.017 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 15:16	Success	-	View
exp_pytrain.20260416151227.005_20260416_151227 Paper: pytrain.20260416151227.005	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 15:13	Success	-	View
exp_self.20260416150518.016_20260416_150518 Paper: self.20260416150518.016	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416150518.016 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 15:06	Success	-	View
exp_self.20260416145733.015_20260416_145734 Paper: self.20260416145733.015	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416145733.015 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 14:58	Success	-	View
exp_self.20260416144955.014_20260416_144955 Paper: self.20260416144955.014	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416144955.014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 14:50	Success	-	View
exp_self.20260416144223.013_20260416_144223 Paper: self.20260416144223.013	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416144223.013 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 14:43	Success	-	View
exp_pytrain.20260416143947.004_20260416_143948 Paper: pytrain.20260416143947.004	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 14:40	Success	-	View
exp_hf_2604.11490_20260416_143701 Paper: hf_2604.11490	Anthropogenic Regional Adaptation in Multimodal Vision-Language Model Paper ID: hf_2604.11490 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-16 14:38	Success	-	View
exp_self.20260416143345.012_20260416_143345 Paper: self.20260416143345.012	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416143345.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 14:34	Success	-	View
exp_hf_2604.12002_20260416_143028 Paper: hf_2604.12002	Self-Distillation Zero: Self-Revision Turns Binary Rewards into Dense Supervision Paper ID: hf_2604.12002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-16 14:31	Success	-	View
exp_self.20260416142601.011_20260416_142601 Paper: self.20260416142601.011	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416142601.011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 14:27	Success	-	View
exp_self.20260416141819.010_20260416_141819 Paper: self.20260416141819.010	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416141819.010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 14:19	Success	-	View
exp_hf_2604.11748_20260416_141246 Paper: hf_2604.11748	LangFlow: Continuous Diffusion Rivals Discrete in Language Modeling Paper ID: hf_2604.11748 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-16 14:13	Success	-	View
exp_self.20260416141034.009_20260416_141034 Paper: self.20260416141034.009	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416141034.009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 14:11	Success	-	View
exp_pytrain.20260416140752.003_20260416_140752 Paper: pytrain.20260416140752.003	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 14:08	Success	-	View
exp_self.20260416140042.008_20260416_140043 Paper: self.20260416140042.008	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416140042.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 14:01	Success	-	View
exp_self.20260416135311.007_20260416_135312 Paper: self.20260416135311.007	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416135311.007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 13:54	Success	-	View
exp_self.20260416134531.006_20260416_134531 Paper: self.20260416134531.006	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416134531.006 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 13:46	Success	-	View
exp_self.20260416133750.005_20260416_133750 Paper: self.20260416133750.005	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416133750.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 13:38	Success	-	View
exp_pytrain.20260416133514.002_20260416_133514 Paper: pytrain.20260416133514.002	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 13:36	Success	-	View
exp_hf_2604.03088_20260416_133249 Paper: hf_2604.03088	SkVM: Compiling Skills for Efficient Execution Everywhere Paper ID: hf_2604.03088 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-16 13:33	Success	-	View
exp_self.20260416133041.004_20260416_133041 Paper: self.20260416133041.004	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416133041.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 13:31	Success	-	View
exp_self.20260416132244.003_20260416_132244 Paper: self.20260416132244.003	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416132244.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 13:23	Success	-	View
exp_self.20260416131459.002_20260416_131459 Paper: self.20260416131459.002	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416131459.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 13:16	Success	-	View
exp_self.20260416130724.001_20260416_130724 Paper: self.20260416130724.001	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416130724.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 13:08	Success	-	View
exp_pytrain.20260416130350.001_20260416_130351 Paper: pytrain.20260416130350.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 13:04	Success	-	View
exp_self.20260416124116.001_20260416_124116 Paper: self.20260416124116.001	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416124116.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 12:41	Pending	-	View
exp_pytrain.20260416123843.001_20260416_123843 Paper: pytrain.20260416123843.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 12:39	Success	-	View
exp_self.20260416123358.015_20260416_123358 Paper: self.20260416123358.015	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416123358.015 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 12:35	Success	-	View
exp_self.20260416122616.014_20260416_122616 Paper: self.20260416122616.014	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416122616.014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 12:27	Success	-	View
exp_self.20260416121830.013_20260416_121831 Paper: self.20260416121830.013	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416121830.013 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 12:19	Success	-	View
exp_pytrain.20260416121548.004_20260416_121548 Paper: pytrain.20260416121548.004	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 12:16	Success	-	View
exp_self.20260416121011.012_20260416_121012 Paper: self.20260416121011.012	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416121011.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 12:11	Success	-	View
exp_self.20260416120258.011_20260416_120258 Paper: self.20260416120258.011	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416120258.011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 12:04	Success	-	View
exp_self.20260416115544.010_20260416_115544 Paper: self.20260416115544.010	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416115544.010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 11:56	Success	-	View
exp_self.20260416114808.009_20260416_114808 Paper: self.20260416114808.009	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416114808.009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 11:49	Success	-	View
exp_pytrain.20260416114421.003_20260416_114421 Paper: pytrain.20260416114421.003	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 11:45	Success	-	View
exp_self.20260416114053.008_20260416_114053 Paper: self.20260416114053.008	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416114053.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 11:41	Success	-	View
exp_self.20260416113313.007_20260416_113314 Paper: self.20260416113313.007	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416113313.007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 11:34	Success	-	View
exp_self.20260416112524.006_20260416_112524 Paper: self.20260416112524.006	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416112524.006 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 11:26	Success	-	View
exp_self.20260416111742.005_20260416_111743 Paper: self.20260416111742.005	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416111742.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 11:18	Success	-	View
exp_pytrain.20260416111246.002_20260416_111246 Paper: pytrain.20260416111246.002	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 11:13	Success	-	View
exp_self.20260416111038.004_20260416_111039 Paper: self.20260416111038.004	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416111038.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 11:11	Success	-	View
exp_hf_2604.07882_20260416_110712 Paper: hf_2604.07882	ReconPhys: Reconstruct Appearance and Physical Attributes from Single Video Paper ID: hf_2604.07882 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-16 11:08	Success	-	View
exp_2604.14147v1_20260416_110447 Paper: 2604.14147v1	ROSE: Retrieval-Oriented Segmentation Enhancement Paper ID: 2604.14147v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-16 11:05	Success	-	View
exp_self.20260416110220.003_20260416_110221 Paper: self.20260416110220.003	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416110220.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 11:03	Success	-	View
exp_2604.14141v1_20260416_105920 Paper: 2604.14141v1	Geometric Context Transformer for Streaming 3D Reconstruction Paper ID: 2604.14141v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-16 11:00	Success	-	View
exp_self.20260416105200.002_20260416_105200 Paper: self.20260416105200.002	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416105200.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 10:53	Success	-	View
exp_hf_2604.14141_20260416_104847 Paper: hf_2604.14141	Geometric Context Transformer for Streaming 3D Reconstruction Paper ID: hf_2604.14141 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-16 10:49	Success	-	View
exp_2604.14149v1_20260416_104632 Paper: 2604.14149v1	One Token per Highly Selective Frame: Towards Extreme Compression for Long Video Understanding Paper ID: 2604.14149v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-16 10:47	Success	-	View
exp_self.20260416104433.001_20260416_104434 Paper: self.20260416104433.001	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260416104433.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-16 10:45	Success	-	View
exp_hf_2604.11045_20260416_104145 Paper: hf_2604.11045	Sema Code: Decoupling AI Coding Agents into Programmable, Embeddable Infrastructure Paper ID: hf_2604.11045 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-16 10:42	Success	-	View
exp_pytrain.20260416103919.001_20260416_103920 Paper: pytrain.20260416103919.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-16 10:40	Success	-	View
exp_self.20260415122901.382_20260415_122902 Paper: self.20260415122901.382	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415122901.382 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 12:30	Success	-	View
exp_self.20260415122136.381_20260415_122136 Paper: self.20260415122136.381	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415122136.381 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 12:22	Success	-	View
exp_pytrain.20260415121901.146_20260415_121901 Paper: pytrain.20260415121901.146	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 12:20	Success	-	View
exp_hf_2604.11177_20260415_121402 Paper: hf_2604.11177	Do Thought Streams Matter? Evaluating Reasoning in Gemini Vision-Language Models for Video Scene Understanding Paper ID: hf_2604.11177 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-15 12:15	Success	-	View
exp_self.20260415121200.380_20260415_121200 Paper: self.20260415121200.380	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415121200.380 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 12:13	Success	-	View
exp_self.20260415120429.379_20260415_120429 Paper: self.20260415120429.379	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415120429.379 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 12:05	Success	-	View
exp_self.20260415115656.378_20260415_115657 Paper: self.20260415115656.378	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415115656.378 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 11:57	Success	-	View
exp_self.20260415114924.377_20260415_114925 Paper: self.20260415114924.377	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415114924.377 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 11:50	Success	-	View
exp_pytrain.20260415114658.145_20260415_114658 Paper: pytrain.20260415114658.145	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 11:48	Success	-	View
exp_self.20260415113952.376_20260415_113954 Paper: self.20260415113952.376	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415113952.376 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 11:40	Success	-	View
exp_self.20260415113225.375_20260415_113225 Paper: self.20260415113225.375	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415113225.375 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 11:33	Success	-	View
exp_self.20260415112459.374_20260415_112500 Paper: self.20260415112459.374	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415112459.374 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 11:26	Success	-	View
exp_self.20260415111723.373_20260415_111723 Paper: self.20260415111723.373	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415111723.373 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 11:18	Success	-	View
exp_pytrain.20260415111455.144_20260415_111456 Paper: pytrain.20260415111455.144	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 11:15	Success	-	View
exp_self.20260415110755.372_20260415_110756 Paper: self.20260415110755.372	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415110755.372 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 11:08	Success	-	View
exp_self.20260415110028.371_20260415_110029 Paper: self.20260415110028.371	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415110028.371 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 11:01	Success	-	View
exp_self.20260415105258.370_20260415_105258 Paper: self.20260415105258.370	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415105258.370 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 10:54	Success	-	View
exp_self.20260415104524.369_20260415_104525 Paper: self.20260415104524.369	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415104524.369 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 10:46	Success	-	View
exp_pytrain.20260415104251.143_20260415_104252 Paper: pytrain.20260415104251.143	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 10:43	Success	-	View
exp_self.20260415103550.368_20260415_103551 Paper: self.20260415103550.368	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415103550.368 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 10:36	Success	-	View
exp_self.20260415102822.367_20260415_102823 Paper: self.20260415102822.367	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415102822.367 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 10:29	Success	-	View
exp_self.20260415102054.366_20260415_102055 Paper: self.20260415102054.366	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415102054.366 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 10:21	Success	-	View
exp_self.20260415101323.365_20260415_101323 Paper: self.20260415101323.365	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415101323.365 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 10:14	Success	-	View
exp_pytrain.20260415101049.142_20260415_101050 Paper: pytrain.20260415101049.142	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 10:11	Success	-	View
exp_self.20260415100347.364_20260415_100348 Paper: self.20260415100347.364	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415100347.364 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 10:04	Success	-	View
exp_self.20260415095614.363_20260415_095614 Paper: self.20260415095614.363	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415095614.363 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 09:57	Success	-	View
exp_self.20260415094843.362_20260415_094843 Paper: self.20260415094843.362	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415094843.362 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 09:49	Success	-	View
exp_self.20260415094124.361_20260415_094124 Paper: self.20260415094124.361	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415094124.361 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 09:42	Success	-	View
exp_pytrain.20260415093850.141_20260415_093850 Paper: pytrain.20260415093850.141	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 09:39	Success	-	View
exp_self.20260415093204.360_20260415_093204 Paper: self.20260415093204.360	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415093204.360 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 09:33	Success	-	View
exp_self.20260415092427.359_20260415_092428 Paper: self.20260415092427.359	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415092427.359 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 09:25	Success	-	View
exp_self.20260415091648.358_20260415_091648 Paper: self.20260415091648.358	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415091648.358 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 09:17	Success	-	View
exp_self.20260415090909.357_20260415_090910 Paper: self.20260415090909.357	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415090909.357 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 09:10	Success	-	View
exp_pytrain.20260415090643.140_20260415_090644 Paper: pytrain.20260415090643.140	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 09:07	Success	-	View
exp_self.20260415085933.356_20260415_085934 Paper: self.20260415085933.356	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415085933.356 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 09:00	Success	-	View
exp_self.20260415085202.355_20260415_085203 Paper: self.20260415085202.355	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415085202.355 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 08:53	Success	-	View
exp_self.20260415084420.354_20260415_084420 Paper: self.20260415084420.354	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415084420.354 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 08:45	Success	-	View
exp_self.20260415083648.353_20260415_083648 Paper: self.20260415083648.353	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415083648.353 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 08:37	Success	-	View
exp_pytrain.20260415083420.139_20260415_083420 Paper: pytrain.20260415083420.139	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 08:35	Success	-	View
exp_self.20260415082712.352_20260415_082712 Paper: self.20260415082712.352	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415082712.352 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 08:28	Success	-	View
exp_self.20260415081940.351_20260415_081940 Paper: self.20260415081940.351	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415081940.351 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 08:20	Success	-	View
exp_self.20260415081210.350_20260415_081211 Paper: self.20260415081210.350	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415081210.350 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 08:13	Success	-	View
exp_self.20260415080427.349_20260415_080427 Paper: self.20260415080427.349	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415080427.349 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 08:05	Success	-	View
exp_pytrain.20260415080158.138_20260415_080159 Paper: pytrain.20260415080158.138	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 08:03	Success	-	View
exp_self.20260415075603.348_20260415_075603 Paper: self.20260415075603.348	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415075603.348 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 07:57	Success	-	View
exp_self.20260415074825.347_20260415_074826 Paper: self.20260415074825.347	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415074825.347 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 07:49	Success	-	View
exp_self.20260415074044.346_20260415_074045 Paper: self.20260415074044.346	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415074044.346 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 07:41	Success	-	View
exp_self.20260415073310.345_20260415_073311 Paper: self.20260415073310.345	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415073310.345 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 07:34	Success	-	View
exp_pytrain.20260415073032.137_20260415_073033 Paper: pytrain.20260415073032.137	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 07:31	Success	-	View
exp_self.20260415072446.344_20260415_072447 Paper: self.20260415072446.344	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415072446.344 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 07:25	Success	-	View
exp_self.20260415071713.343_20260415_071713 Paper: self.20260415071713.343	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415071713.343 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 07:18	Success	-	View
exp_self.20260415070911.342_20260415_070911 Paper: self.20260415070911.342	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415070911.342 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 07:10	Success	-	View
exp_self.20260415070116.341_20260415_070116 Paper: self.20260415070116.341	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415070116.341 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 07:02	Success	-	View
exp_pytrain.20260415065845.136_20260415_065845 Paper: pytrain.20260415065845.136	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 06:59	Success	-	View
exp_self.20260415065302.340_20260415_065302 Paper: self.20260415065302.340	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415065302.340 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 06:54	Success	-	View
exp_self.20260415064521.339_20260415_064522 Paper: self.20260415064521.339	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415064521.339 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 06:46	Success	-	View
exp_self.20260415063742.338_20260415_063742 Paper: self.20260415063742.338	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415063742.338 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 06:38	Success	-	View
exp_self.20260415063011.337_20260415_063012 Paper: self.20260415063011.337	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415063011.337 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 06:31	Success	-	View
exp_pytrain.20260415062710.135_20260415_062711 Paper: pytrain.20260415062710.135	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 06:28	Success	-	View
exp_self.20260415062103.336_20260415_062103 Paper: self.20260415062103.336	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415062103.336 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 06:22	Success	-	View
exp_self.20260415061326.335_20260415_061326 Paper: self.20260415061326.335	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415061326.335 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 06:14	Success	-	View
exp_self.20260415060543.334_20260415_060544 Paper: self.20260415060543.334	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415060543.334 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 06:06	Success	-	View
exp_self.20260415055813.333_20260415_055813 Paper: self.20260415055813.333	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415055813.333 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 05:59	Success	-	View
exp_pytrain.20260415055550.134_20260415_055551 Paper: pytrain.20260415055550.134	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 05:56	Success	-	View
exp_self.20260415054842.332_20260415_054842 Paper: self.20260415054842.332	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415054842.332 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 05:49	Success	-	View
exp_self.20260415054105.331_20260415_054105 Paper: self.20260415054105.331	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415054105.331 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 05:42	Success	-	View
exp_self.20260415053329.330_20260415_053330 Paper: self.20260415053329.330	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415053329.330 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 05:34	Success	-	View
exp_self.20260415052556.329_20260415_052556 Paper: self.20260415052556.329	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415052556.329 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 05:26	Success	-	View
exp_pytrain.20260415052333.133_20260415_052333 Paper: pytrain.20260415052333.133	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 05:24	Success	-	View
exp_self.20260415051633.328_20260415_051634 Paper: self.20260415051633.328	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415051633.328 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 05:17	Success	-	View
exp_self.20260415050907.327_20260415_050907 Paper: self.20260415050907.327	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415050907.327 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 05:10	Success	-	View
exp_self.20260415050139.326_20260415_050139 Paper: self.20260415050139.326	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415050139.326 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 05:02	Success	-	View
exp_cr_10.3390_aichem1020007_20260415_045718 Paper: cr_10.3390_aichem1020007	Active Learning on Protein Language Model Embeddings Accelerates Rubisco Variant Discovery for Desired Traits Paper ID: cr_10.3390_aichem1020007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 04:58	Success	-	View
exp_self.20260415045408.325_20260415_045408 Paper: self.20260415045408.325	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415045408.325 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 04:55	Success	-	View
exp_pytrain.20260415045141.132_20260415_045142 Paper: pytrain.20260415045141.132	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 04:52	Success	-	View
exp_self.20260415044451.324_20260415_044451 Paper: self.20260415044451.324	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415044451.324 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 04:45	Success	-	View
exp_self.20260415043723.323_20260415_043724 Paper: self.20260415043723.323	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415043723.323 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 04:38	Success	-	View
exp_self.20260415042956.322_20260415_042956 Paper: self.20260415042956.322	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415042956.322 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 04:30	Success	-	View
exp_self.20260415042232.321_20260415_042232 Paper: self.20260415042232.321	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415042232.321 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 04:23	Success	-	View
exp_pytrain.20260415042004.131_20260415_042005 Paper: pytrain.20260415042004.131	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 04:21	Success	-	View
exp_self.20260415040230.320_20260415_040230 Paper: self.20260415040230.320	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415040230.320 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 04:03	Success	-	View
exp_self.20260415035455.319_20260415_035455 Paper: self.20260415035455.319	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415035455.319 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 03:55	Success	-	View
exp_self.20260415034725.318_20260415_034725 Paper: self.20260415034725.318	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415034725.318 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 03:48	Success	-	View
exp_self.20260415033959.317_20260415_033959 Paper: self.20260415033959.317	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415033959.317 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 03:41	Success	-	View
exp_pytrain.20260415033729.130_20260415_033730 Paper: pytrain.20260415033729.130	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 03:38	Success	-	View
exp_self.20260415033150.316_20260415_033150 Paper: self.20260415033150.316	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415033150.316 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 03:32	Success	-	View
exp_self.20260415032424.315_20260415_032424 Paper: self.20260415032424.315	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415032424.315 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 03:25	Success	-	View
exp_self.20260415031642.314_20260415_031643 Paper: self.20260415031642.314	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415031642.314 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 03:17	Success	-	View
exp_self.20260415030907.313_20260415_030908 Paper: self.20260415030907.313	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415030907.313 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 03:10	Success	-	View
exp_pytrain.20260415030533.129_20260415_030533 Paper: pytrain.20260415030533.129	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 03:06	Success	-	View
exp_self.20260415030126.312_20260415_030126 Paper: self.20260415030126.312	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415030126.312 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 03:02	Success	-	View
exp_self.20260415025346.311_20260415_025346 Paper: self.20260415025346.311	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415025346.311 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 02:54	Success	-	View
exp_self.20260415024618.310_20260415_024618 Paper: self.20260415024618.310	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415024618.310 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 02:47	Success	-	View
exp_self.20260415023856.309_20260415_023856 Paper: self.20260415023856.309	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415023856.309 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 02:39	Success	-	View
exp_pytrain.20260415023419.128_20260415_023419 Paper: pytrain.20260415023419.128	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 02:35	Success	-	View
exp_self.20260415023214.308_20260415_023214 Paper: self.20260415023214.308	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415023214.308 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 02:33	Success	-	View
exp_cr_10.1038_s41524-026-01995-1_20260415_022905 Paper: cr_10.1038_s41524-026-01995-1	High-throughput parameter estimation from experimental data using Bayesian Inference with accelerated sampling Paper ID: cr_10.1038_s41524-026-01995-1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Reco...	04-15 02:30	Success	-	View
exp_self.20260415022339.307_20260415_022339 Paper: self.20260415022339.307	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415022339.307 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 02:24	Success	-	View
exp_self.20260415021609.306_20260415_021609 Paper: self.20260415021609.306	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415021609.306 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 02:17	Success	-	View
exp_self.20260415020842.305_20260415_020842 Paper: self.20260415020842.305	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415020842.305 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 02:09	Success	-	View
exp_pytrain.20260415020258.127_20260415_020258 Paper: pytrain.20260415020258.127	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 02:04	Success	-	View
exp_self.20260415020104.304_20260415_020105 Paper: self.20260415020104.304	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415020104.304 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 02:02	Success	-	View
exp_self.20260415013843.303_20260415_013844 Paper: self.20260415013843.303	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415013843.303 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 01:39	Success	-	View
exp_hf_2604.12373_20260415_013311 Paper: hf_2604.12373	Masked by Consensus: Disentangling Privileged Knowledge in LLM Correctness Paper ID: hf_2604.12373 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-15 01:34	Success	-	View
exp_self.20260415013113.302_20260415_013113 Paper: self.20260415013113.302	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415013113.302 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 01:32	Success	-	View
exp_pytrain.20260415012842.126_20260415_012843 Paper: pytrain.20260415012842.126	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 01:29	Success	-	View
exp_self.20260415012146.301_20260415_012147 Paper: self.20260415012146.301	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415012146.301 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 01:22	Success	-	View
exp_self.20260415011423.300_20260415_011423 Paper: self.20260415011423.300	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415011423.300 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 01:15	Success	-	View
exp_self.20260415010700.299_20260415_010701 Paper: self.20260415010700.299	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415010700.299 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 01:08	Success	-	View
exp_self.20260415005937.298_20260415_005937 Paper: self.20260415005937.298	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415005937.298 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 01:00	Success	-	View
exp_pytrain.20260415005711.125_20260415_005711 Paper: pytrain.20260415005711.125	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 00:58	Success	-	View
exp_self.20260415005026.297_20260415_005027 Paper: self.20260415005026.297	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415005026.297 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 00:51	Success	-	View
exp_self.20260415004300.296_20260415_004300 Paper: self.20260415004300.296	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415004300.296 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 00:44	Success	-	View
exp_self.20260415003539.295_20260415_003540 Paper: self.20260415003539.295	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415003539.295 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 00:36	Success	-	View
exp_self.20260415002819.294_20260415_002819 Paper: self.20260415002819.294	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415002819.294 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 00:29	Success	-	View
exp_pytrain.20260415002552.124_20260415_002552 Paper: pytrain.20260415002552.124	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-15 00:26	Success	-	View
exp_self.20260415001906.293_20260415_001907 Paper: self.20260415001906.293	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415001906.293 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 00:20	Success	-	View
exp_self.20260415001139.292_20260415_001140 Paper: self.20260415001139.292	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415001139.292 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 00:12	Success	-	View
exp_self.20260415000409.291_20260415_000409 Paper: self.20260415000409.291	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260415000409.291 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-15 00:05	Success	-	View
exp_hf_2604.05072_20260414_235948 Paper: hf_2604.05072	Hierarchical SVG Tokenization: Learning Compact Visual Programs for Scalable Vector Graphics Modeling Paper ID: hf_2604.05072 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-15 00:00	Success	-	View
exp_self.20260414235641.290_20260414_235642 Paper: self.20260414235641.290	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414235641.290 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 23:57	Success	-	View
exp_pytrain.20260414235420.123_20260414_235420 Paper: pytrain.20260414235420.123	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 23:55	Success	-	View
exp_self.20260414234722.289_20260414_234723 Paper: self.20260414234722.289	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414234722.289 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 23:48	Success	-	View
exp_self.20260414234003.288_20260414_234003 Paper: self.20260414234003.288	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414234003.288 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 23:41	Success	-	View
exp_self.20260414233240.287_20260414_233241 Paper: self.20260414233240.287	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414233240.287 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 23:33	Success	-	View
exp_self.20260414232519.286_20260414_232520 Paper: self.20260414232519.286	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414232519.286 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 23:26	Success	-	View
exp_pytrain.20260414232250.122_20260414_232251 Paper: pytrain.20260414232250.122	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 23:23	Success	-	View
exp_self.20260414231559.285_20260414_231559 Paper: self.20260414231559.285	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414231559.285 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 23:17	Success	-	View
exp_self.20260414230832.284_20260414_230833 Paper: self.20260414230832.284	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414230832.284 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 23:09	Success	-	View
exp_self.20260414230115.283_20260414_230115 Paper: self.20260414230115.283	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414230115.283 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 23:02	Success	-	View
exp_hf_2604.12627_20260414_225652 Paper: hf_2604.12627	KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance Paper ID: hf_2604.12627 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-14 22:57	Success	-	View
exp_self.20260414225345.282_20260414_225346 Paper: self.20260414225345.282	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414225345.282 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 22:54	Success	-	View
exp_pytrain.20260414225122.121_20260414_225122 Paper: pytrain.20260414225122.121	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 22:52	Success	-	View
exp_self.20260414224434.281_20260414_224434 Paper: self.20260414224434.281	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414224434.281 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 22:45	Success	-	View
exp_self.20260414223714.280_20260414_223714 Paper: self.20260414223714.280	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414223714.280 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 22:38	Success	-	View
exp_hf_2604.12322_20260414_223358 Paper: hf_2604.12322	Self-Adversarial One Step Generation via Condition Shifting Paper ID: hf_2604.12322 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-14 22:35	Success	-	View
exp_self.20260414222945.279_20260414_222945 Paper: self.20260414222945.279	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414222945.279 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 22:30	Success	-	View
exp_self.20260414222221.278_20260414_222222 Paper: self.20260414222221.278	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414222221.278 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 22:23	Success	-	View
exp_pytrain.20260414221954.120_20260414_221954 Paper: pytrain.20260414221954.120	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 22:20	Success	-	View
exp_hf_2604.12890_20260414_221711 Paper: hf_2604.12890	Towards Long-horizon Agentic Multimodal Search Paper ID: hf_2604.12890 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-14 22:18	Success	-	View
exp_self.20260414221253.277_20260414_221254 Paper: self.20260414221253.277	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414221253.277 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 22:13	Success	-	View
exp_self.20260414220527.276_20260414_220528 Paper: self.20260414220527.276	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414220527.276 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 22:06	Success	-	View
exp_hf_2604.13010_20260414_220207 Paper: hf_2604.13010	Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation Paper ID: hf_2604.13010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-14 22:03	Success	-	View
exp_hf_2604.12374_20260414_215840 Paper: hf_2604.12374	Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning Paper ID: hf_2604.12374 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-14 21:59	Success	-	View
exp_self.20260414215643.275_20260414_215643 Paper: self.20260414215643.275	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414215643.275 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 21:57	Success	-	View
exp_self.20260414214915.274_20260414_214916 Paper: self.20260414214915.274	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414214915.274 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 21:50	Success	-	View
exp_pytrain.20260414214646.119_20260414_214646 Paper: pytrain.20260414214646.119	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 21:47	Success	-	View
exp_hf_2604.08865_20260414_214149 Paper: hf_2604.08865	SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks Paper ID: hf_2604.08865 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-14 21:42	Success	-	View
exp_self.20260414213952.273_20260414_213952 Paper: self.20260414213952.273	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414213952.273 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 21:40	Success	-	View
exp_self.20260414213231.272_20260414_213231 Paper: self.20260414213231.272	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414213231.272 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 21:33	Success	-	View
exp_self.20260414212508.271_20260414_212508 Paper: self.20260414212508.271	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414212508.271 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 21:26	Success	-	View
exp_2604.13024v1_20260414_212157 Paper: 2604.13024v1	CLAD: Efficient Log Anomaly Detection Directly on Compressed Representations Paper ID: 2604.13024v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-14 21:22	Success	-	View
exp_self.20260414211744.270_20260414_211745 Paper: self.20260414211744.270	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414211744.270 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 21:18	Success	-	View
exp_pytrain.20260414211516.118_20260414_211516 Paper: pytrain.20260414211516.118	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 21:16	Success	-	View
exp_self.20260414211059.269_20260414_211059 Paper: self.20260414211059.269	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414211059.269 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 21:12	Success	-	View
exp_2604.13035v1_20260414_210746 Paper: 2604.13035v1	SceneCritic: A Symbolic Evaluator for 3D Indoor Scene Synthesis Paper ID: 2604.13035v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-14 21:08	Success	-	View
exp_self.20260414210042.268_20260414_210043 Paper: self.20260414210042.268	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414210042.268 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 21:01	Success	-	View
exp_self.20260414205319.267_20260414_205319 Paper: self.20260414205319.267	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414205319.267 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 20:54	Success	-	View
exp_self.20260414204559.266_20260414_204600 Paper: self.20260414204559.266	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414204559.266 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 20:47	Success	-	View
exp_pytrain.20260414204330.117_20260414_204331 Paper: pytrain.20260414204330.117	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 20:44	Success	-	View
exp_self.20260414203634.265_20260414_203634 Paper: self.20260414203634.265	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414203634.265 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 20:37	Success	-	View
exp_self.20260414202909.264_20260414_202909 Paper: self.20260414202909.264	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414202909.264 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 20:30	Success	-	View
exp_self.20260414202147.263_20260414_202147 Paper: self.20260414202147.263	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414202147.263 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 20:22	Success	-	View
exp_self.20260414201421.262_20260414_201422 Paper: self.20260414201421.262	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414201421.262 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 20:15	Success	-	View
exp_pytrain.20260414201154.116_20260414_201154 Paper: pytrain.20260414201154.116	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 20:12	Success	-	View
exp_self.20260414200511.261_20260414_200511 Paper: self.20260414200511.261	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414200511.261 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 20:06	Success	-	View
exp_self.20260414195748.260_20260414_195749 Paper: self.20260414195748.260	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414195748.260 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 19:58	Success	-	View
exp_self.20260414195026.259_20260414_195027 Paper: self.20260414195026.259	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414195026.259 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 19:51	Success	-	View
exp_gh_leitoooatr_PythonVectorDB_20260414_194717 Paper: gh_leitoooatr_PythonVectorDB	leitoooatr/PythonVectorDB Paper ID: gh_leitoooatr_PythonVectorDB - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recov...	04-14 19:48	Success	-	View
exp_self.20260414194238.258_20260414_194238 Paper: self.20260414194238.258	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414194238.258 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 19:43	Success	-	View
exp_pytrain.20260414194017.115_20260414_194017 Paper: pytrain.20260414194017.115	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 19:41	Success	-	View
exp_self.20260414193322.257_20260414_193322 Paper: self.20260414193322.257	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414193322.257 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 19:34	Success	-	View
exp_self.20260414192600.256_20260414_192601 Paper: self.20260414192600.256	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414192600.256 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 19:27	Success	-	View
exp_self.20260414191843.255_20260414_191843 Paper: self.20260414191843.255	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414191843.255 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 19:19	Success	-	View
exp_gh_Sheaantisocial810_pytorch-mobilenet-efficiency_20260414_191420 Paper: gh_Sheaantisocial810_pytorch-mobilenet-efficiency	Sheaantisocial810/pytorch-mobilenet-efficiency Paper ID: gh_Sheaantisocial810_pytorch-mobilenet-efficiency - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - E...	04-14 19:15	Success	-	View
exp_self.20260414191114.254_20260414_191114 Paper: self.20260414191114.254	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414191114.254 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 19:12	Success	-	View
exp_pytrain.20260414190847.114_20260414_190848 Paper: pytrain.20260414190847.114	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 19:09	Success	-	View
exp_self.20260414190159.253_20260414_190159 Paper: self.20260414190159.253	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414190159.253 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 19:03	Success	-	View
exp_self.20260414185435.252_20260414_185436 Paper: self.20260414185435.252	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414185435.252 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 18:55	Success	-	View
exp_self.20260414184714.251_20260414_184715 Paper: self.20260414184714.251	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414184714.251 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 18:48	Success	-	View
exp_self.20260414183950.250_20260414_183950 Paper: self.20260414183950.250	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414183950.250 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 18:40	Success	-	View
exp_pytrain.20260414183730.113_20260414_183731 Paper: pytrain.20260414183730.113	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 18:38	Success	-	View
exp_self.20260414183210.249_20260414_183211 Paper: self.20260414183210.249	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414183210.249 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 18:33	Success	-	View
exp_self.20260414182454.248_20260414_182454 Paper: self.20260414182454.248	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414182454.248 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 18:25	Success	-	View
exp_self.20260414181734.247_20260414_181735 Paper: self.20260414181734.247	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414181734.247 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 18:18	Success	-	View
exp_self.20260414181015.246_20260414_181015 Paper: self.20260414181015.246	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414181015.246 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 18:11	Success	-	View
exp_hf_2604.04385_20260414_180721 Paper: hf_2604.04385	How Alignment Routes: Localizing, Scaling, and Controlling Policy Circuits in Language Models Paper ID: hf_2604.04385 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-14 18:08	Success	-	View
exp_pytrain.20260414180526.112_20260414_180526 Paper: pytrain.20260414180526.112	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 18:06	Success	-	View
exp_self.20260414180331.245_20260414_180332 Paper: self.20260414180331.245	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414180331.245 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 18:04	Success	-	View
exp_hf_2604.11004_20260414_180040 Paper: hf_2604.11004	Panoptic Pairwise Distortion Graph Paper ID: hf_2604.11004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-14 18:01	Success	-	View
exp_cr_10.3390_axioms15040289_20260414_175754 Paper: cr_10.3390_axioms15040289	Amortized Parameter Inference for the Arbitrary-Order Hidden Markov Model Paper ID: cr_10.3390_axioms15040289 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovere...	04-14 17:58	Success	-	View
exp_hf_2604.10539_20260414_175532 Paper: hf_2604.10539	IceCache: Memory-efficient KV-cache Management for Long-Sequence LLMs Paper ID: hf_2604.10539 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-14 17:56	Success	-	View
exp_self.20260414175335.244_20260414_175336 Paper: self.20260414175335.244	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414175335.244 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 17:54	Success	-	View
exp_pytrain.20260414173208.111_20260414_173234 Paper: pytrain.20260414173208.111	AST-Based Package Type Coverage Analyzer This benchmark tests the ability to construct a static analysis tool using Python's standard library. The goal is to validate type annotation coverage across a dynamically generated Python package structure without executing the target code...	04-14 17:33	Success	-	View
exp_self.20260414171027.243_20260414_171046 Paper: self.20260414171027.243	Benchmark: SSM Memory Policy Stress Test This benchmark evaluates the impact of a disciplined memory management strategy on State Space Model (SSM) throughput and VRAM consumption. Hypothesis Applying a chunked execution strategy (disciplined memory policy) to SSM layers significa...	04-14 17:11	Success	-	View
exp_pytrain.20260414161649.110_20260414_161727 Paper: pytrain.20260414161649.110	Generic Resource Loader Benchmark This benchmark demonstrates a robust implementation of a generic resource loader using Python's modern type hinting system (PEP 585/PEP 591) and the `importlib.resources` API. Objective To verify that a generic class `ResourceLoader[T]` can...	04-14 16:18	Success	-	View
exp_self.20260414155420.242_20260414_155447 Paper: self.20260414155420.242	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414155420.242 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 15:55	Success	-	View
exp_pytrain.20260414150311.109_20260414_150344 Paper: pytrain.20260414150311.109	Strictly Typed Plugin Registry with Runtime Validation Overview This benchmark validates a robust plugin architecture implementation using Python's `typing.Protocol`. The system enforces interface compliance at both static (linting/type checking) and dynamic (runtime) levels. Problem Statement...	04-14 15:04	Success	-	View
exp_self.20260414144028.241_20260414_144110 Paper: self.20260414144028.241	Self-directed benchmark: ssm strategy stress test This benchmark evaluates the performance characteristics and memory efficiency of a Selective State Space Model (SSM) strategy against a standard Transformer (Attention) baseline. Hypothesis Applying SSM with a disciplined memory policy imp...	04-14 14:42	Success	-	View
exp_pytrain.20260414134700.108_20260414_134734 Paper: pytrain.20260414134700.108	Python Skill Fallback Title: Type-Validated ZipApp Packager - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 13:48	Success	-	View
exp_self.20260414132528.240_20260414_132628 Paper: self.20260414132528.240	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414132528.240 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 13:27	Success	-	View
exp_pytrain.20260414123136.107_20260414_123157 Paper: pytrain.20260414123136.107	Strictly Typed Plugin Architecture with Dynamic Discovery Overview This benchmark demonstrates a robust, type-safe plugin architecture using Python's standard library. It leverages `typing.Protocol` for structural interface enforcement and `types.ModuleType` for dynamic module generation and intro...	04-14 12:32	Success	-	View
exp_self.20260414120714.239_20260414_120732 Paper: self.20260414120714.239	SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that applying SSM (State Space Model) strategies with a disciplined memory policy improves throughput under strict 8GB VRAM constraints. Context SSMs, such as Mamba, rely on efficient recurrence mecha...	04-14 12:09	Success	-	View
exp_pytrain.20260414110652.106_20260414_110752 Paper: pytrain.20260414110652.106	Python Skill Fallback Title: Strictly Typed Data Pipeline with Dynamic Registration - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 11:08	Success	-	View
exp_self.20260414103906.238_20260414_103943 Paper: self.20260414103906.238	SSM Strategy Stress Test Benchmark Overview This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with a disciplined memory policy and dynamic precision can significantly improve throughput under constrained 8GB VRAM environments. I...	04-14 10:40	Success	-	View
exp_pytrain.20260414093416.105_20260414_093513 Paper: pytrain.20260414093416.105	Dynamic Protocol-Compliant Plugin Loader This benchmark evaluates a system's ability to dynamically construct Python package structures in a volatile environment and enforce strict structural subtyping using `typing.Protocol`. It tests the candidate's capability to manage temporar...	04-14 09:36	Success	-	View
exp_self.20260414090917.237_20260414_090943 Paper: self.20260414090917.237	SSM Strategy Stress Test This benchmark evaluates the performance of State Space Models (SSMs) under varying memory policies. Hypothesis Applying an SSM with a disciplined memory policy (using caching and dynamic precision) improves throughput and efficiency under...	04-14 09:11	Success	-	View
exp_pytrain.20260414081013.104_20260414_081042 Paper: pytrain.20260414081013.104	Python Skill Fallback Title: Strictly-Typed Modular Plugin Registry - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 08:11	Success	-	View
exp_self.20260414074311.236_20260414_074343 Paper: self.20260414074311.236	SSM Strategy Stress Test Benchmark Overview This benchmark tests the hypothesis that State Space Models (SSMs) with a disciplined memory policy (specifically selective state spaces like Mamba) offer superior throughput and VRAM efficiency compared to standard attention mecha...	04-14 07:45	Success	-	View
exp_pytrain.20260414064022.103_20260414_064045 Paper: pytrain.20260414064022.103	Python Skill Fallback Title: Runtime Plugin Loader with Protocol Validation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 06:41	Success	-	View
exp_self.20260414061348.235_20260414_061444 Paper: self.20260414061348.235	SSM Strategy Stress Test Benchmark This benchmark evaluates the performance of a State Space Model (SSM) strategy against a traditional ablated baseline. Specifically, it tests the hypothesis that applying SSMs with a disciplined memory policy (using dynamic precision and re...	04-14 06:15	Success	-	View
exp_pytrain.20260414051045.102_20260414_051117 Paper: pytrain.20260414051045.102	Python Skill Fallback Title: PEP 695 Generic Dependency Container with Runtime Validation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 05:12	Success	-	View
exp_self.20260414044155.234_20260414_044252 Paper: self.20260414044155.234	SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that a State Space Model (SSM) implementation, when combined with a disciplined memory policy (specifically dynamic precision mixing and state caching), yields superior throughput and lower VRAM usage...	04-14 04:43	Success	-	View
exp_pytrain.20260414033228.101_20260414_033254 Paper: pytrain.20260414033228.101	Python Skill Fallback Title: Generic Plugin Registry with API Surface Hygiene - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 03:33	Success	-	View
exp_2604.11807v1_20260414_031604 Paper: 2604.11807v1	Physics-Informed State Space Models for Reliable Solar Irradiance Forecasting in Off-Grid Systems Paper ID: 2604.11807v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-14 03:17	Success	-	View
exp_pytrain.20260414025339.100_20260414_025339 Paper: pytrain.20260414025339.100	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 02:54	Success	-	View
exp_self.20260414024923.233_20260414_024923 Paper: self.20260414024923.233	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414024923.233 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 02:50	Success	-	View
exp_self.20260414024147.232_20260414_024148 Paper: self.20260414024147.232	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414024147.232 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 02:42	Success	-	View
exp_pytrain.20260414022017.099_20260414_022118 Paper: pytrain.20260414022017.099	Python Skill Fallback Title: Typing-Driven Plugin Registry with Namespace Control - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 02:22	Success	-	View
exp_self.20260414015142.231_20260414_015142 Paper: self.20260414015142.231	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414015142.231 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 01:52	Success	-	View
exp_self.20260414014353.230_20260414_014353 Paper: self.20260414014353.230	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414014353.230 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 01:44	Success	-	View
exp_pytrain.20260414014109.098_20260414_014109 Paper: pytrain.20260414014109.098	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 01:42	Success	-	View
exp_self.20260414013536.229_20260414_013536 Paper: self.20260414013536.229	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414013536.229 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 01:36	Success	-	View
exp_self.20260414012749.228_20260414_012749 Paper: self.20260414012749.228	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414012749.228 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 01:28	Success	-	View
exp_self.20260414012000.227_20260414_012000 Paper: self.20260414012000.227	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414012000.227 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 01:21	Success	-	View
exp_self.20260414011214.226_20260414_011214 Paper: self.20260414011214.226	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414011214.226 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 01:13	Success	-	View
exp_pytrain.20260414010934.097_20260414_010934 Paper: pytrain.20260414010934.097	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 01:10	Success	-	View
exp_self.20260414010507.225_20260414_010508 Paper: self.20260414010507.225	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414010507.225 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 01:06	Success	-	View
exp_self.20260414005754.224_20260414_005754 Paper: self.20260414005754.224	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414005754.224 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 00:58	Success	-	View
exp_self.20260414005013.223_20260414_005014 Paper: self.20260414005013.223	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414005013.223 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 00:51	Success	-	View
exp_self.20260414004231.222_20260414_004232 Paper: self.20260414004231.222	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414004231.222 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 00:43	Success	-	View
exp_pytrain.20260414003726.096_20260414_003727 Paper: pytrain.20260414003726.096	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 00:38	Success	-	View
exp_self.20260414003516.221_20260414_003516 Paper: self.20260414003516.221	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414003516.221 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 00:36	Success	-	View
exp_self.20260414002727.220_20260414_002728 Paper: self.20260414002727.220	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414002727.220 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 00:28	Success	-	View
exp_self.20260414001944.219_20260414_001945 Paper: self.20260414001944.219	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414001944.219 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 00:20	Success	-	View
exp_self.20260414001201.218_20260414_001201 Paper: self.20260414001201.218	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414001201.218 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 00:13	Success	-	View
exp_hf_2604.10333_20260414_000835 Paper: hf_2604.10333	Zero-shot World Models Are Developmentally Efficient Learners Paper ID: hf_2604.10333 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-14 00:09	Success	-	View
exp_pytrain.20260414000401.095_20260414_000402 Paper: pytrain.20260414000401.095	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-14 00:05	Success	-	View
exp_self.20260414000152.217_20260414_000152 Paper: self.20260414000152.217	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260414000152.217 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-14 00:02	Success	-	View
exp_self.20260413235410.216_20260413_235411 Paper: self.20260413235410.216	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413235410.216 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 23:55	Success	-	View
exp_hf_2604.10030_20260413_235112 Paper: hf_2604.10030	Prompt Relay: Inference-Time Temporal Control for Multi-Event Video Generation Paper ID: hf_2604.10030 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 23:52	Success	-	View
exp_self.20260413234354.215_20260413_234354 Paper: self.20260413234354.215	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413234354.215 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 23:44	Success	-	View
exp_self.20260413233607.214_20260413_233607 Paper: self.20260413233607.214	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413233607.214 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 23:37	Success	-	View
exp_pytrain.20260413233109.094_20260413_233110 Paper: pytrain.20260413233109.094	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 23:32	Success	-	View
exp_self.20260413232854.213_20260413_232855 Paper: self.20260413232854.213	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413232854.213 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 23:29	Success	-	View
exp_self.20260413232118.212_20260413_232118 Paper: self.20260413232118.212	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413232118.212 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 23:22	Success	-	View
exp_hf_2604.09212_20260413_231753 Paper: hf_2604.09212	SPASM: Stable Persona-driven Agent Simulation for Multi-turn Dialogue Generation Paper ID: hf_2604.09212 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 23:18	Success	-	View
exp_2604.11808v1_20260413_231528 Paper: 2604.11808v1	Pair2Scene: Learning Local Object Relations for Procedural Scene Generation Paper ID: 2604.11808v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-13 23:16	Success	-	View
exp_2604.11804v1_20260413_231109 Paper: 2604.11804v1	OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation Paper ID: 2604.11804v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-13 23:12	Success	-	View
exp_self.20260413230858.211_20260413_230858 Paper: self.20260413230858.211	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413230858.211 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 23:10	Success	-	View
exp_hf_2604.11035_20260413_230555 Paper: hf_2604.11035	Introspective Diffusion Language Models Paper ID: hf_2604.11035 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 23:06	Success	-	View
exp_cr_10.1186_s42400-026-00589-0_20260413_230302 Paper: cr_10.1186_s42400-026-00589-0	VulSCC: image-based vulnerability detection with SPP-CNN and code large language model Paper ID: cr_10.1186_s42400-026-00589-0 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Reco...	04-13 23:04	Success	-	View
exp_pytrain.20260413225641.093_20260413_225729 Paper: pytrain.20260413225641.093	Strictly-Typed Dynamic Module Loader This benchmark evaluates a robust, strictly-typed plugin architecture that dynamically discovers and imports modules at runtime without hardcoded imports. It simulates a high-performance plugin system where: 1. Dynamic Discovery: A `Plu...	04-13 22:58	Success	-	View
exp_self.20260413224201.210_20260413_224201 Paper: self.20260413224201.210	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413224201.210 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 22:43	Success	-	View
exp_self.20260413223424.209_20260413_223424 Paper: self.20260413223424.209	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413223424.209 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 22:35	Success	-	View
exp_self.20260413222649.208_20260413_222650 Paper: self.20260413222649.208	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413222649.208 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 22:27	Success	-	View
exp_self.20260413221917.207_20260413_221918 Paper: self.20260413221917.207	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413221917.207 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 22:20	Success	-	View
exp_pytrain.20260413221639.092_20260413_221639 Paper: pytrain.20260413221639.092	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 22:17	Success	-	View
exp_hf_2604.10098_20260413_221348 Paper: hf_2604.10098	Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation Paper ID: hf_2604.10098 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 22:14	Success	-	View
exp_self.20260413221026.206_20260413_221027 Paper: self.20260413221026.206	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413221026.206 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 22:11	Success	-	View
exp_2604.11585v1_20260413_220705 Paper: 2604.11585v1	GeomPrompt: Geometric Prompt Learning for RGB-D Semantic Segmentation Under Missing and Degraded Depth Paper ID: 2604.11585v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-13 22:08	Success	-	View
exp_self.20260413220238.205_20260413_220238 Paper: self.20260413220238.205	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413220238.205 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 22:03	Success	-	View
exp_self.20260413215458.204_20260413_215459 Paper: self.20260413215458.204	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413215458.204 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 21:56	Success	-	View
exp_2604.11590v1_20260413_215204 Paper: 2604.11590v1	Learning Robustness at Test-Time from a Non-Robust Teacher Paper ID: 2604.11590v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-13 21:53	Success	-	View
exp_self.20260413214506.203_20260413_214506 Paper: self.20260413214506.203	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413214506.203 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 21:46	Success	-	View
exp_pytrain.20260413214228.091_20260413_214229 Paper: pytrain.20260413214228.091	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 21:43	Success	-	View
exp_hf_2604.11804_20260413_213941 Paper: hf_2604.11804	OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation Paper ID: hf_2604.11804 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 21:40	Success	-	View
exp_self.20260413213625.202_20260413_213625 Paper: self.20260413213625.202	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413213625.202 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 21:37	Success	-	View
exp_self.20260413212842.201_20260413_212843 Paper: self.20260413212842.201	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413212842.201 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 21:29	Success	-	View
exp_self.20260413212107.200_20260413_212108 Paper: self.20260413212107.200	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413212107.200 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 21:22	Success	-	View
exp_self.20260413211337.199_20260413_211338 Paper: self.20260413211337.199	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413211337.199 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 21:14	Success	-	View
exp_pytrain.20260413211102.090_20260413_211102 Paper: pytrain.20260413211102.090	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 21:12	Success	-	View
exp_self.20260413210407.198_20260413_210407 Paper: self.20260413210407.198	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413210407.198 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 21:05	Success	-	View
exp_self.20260413205635.197_20260413_205636 Paper: self.20260413205635.197	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413205635.197 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 20:57	Success	-	View
exp_2604.10556v1_20260413_205104 Paper: 2604.10556v1	Lost in Diffusion: Uncovering Hallucination Patterns and Failure Modes in Diffusion Large Language Models Paper ID: 2604.10556v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-13 20:52	Success	-	View
exp_self.20260413204857.196_20260413_204857 Paper: self.20260413204857.196	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413204857.196 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 20:50	Success	-	View
exp_self.20260413204123.195_20260413_204124 Paper: self.20260413204123.195	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413204123.195 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 20:42	Success	-	View
exp_pytrain.20260413203855.089_20260413_203856 Paper: pytrain.20260413203855.089	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 20:39	Success	-	View
exp_self.20260413203349.194_20260413_203349 Paper: self.20260413203349.194	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413203349.194 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 20:34	Success	-	View
exp_self.20260413202457.193_20260413_202457 Paper: self.20260413202457.193	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413202457.193 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 20:26	Success	-	View
exp_self.20260413201653.192_20260413_201653 Paper: self.20260413201653.192	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413201653.192 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 20:17	Success	-	View
exp_self.20260413200910.191_20260413_200910 Paper: self.20260413200910.191	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413200910.191 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 20:10	Success	-	View
exp_pytrain.20260413200636.088_20260413_200637 Paper: pytrain.20260413200636.088	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 20:07	Success	-	View
exp_self.20260413195935.190_20260413_195935 Paper: self.20260413195935.190	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413195935.190 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 20:00	Success	-	View
exp_self.20260413195206.189_20260413_195207 Paper: self.20260413195206.189	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413195206.189 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 19:53	Success	-	View
exp_self.20260413194437.188_20260413_194437 Paper: self.20260413194437.188	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413194437.188 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 19:45	Success	-	View
exp_self.20260413193700.187_20260413_193701 Paper: self.20260413193700.187	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413193700.187 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 19:38	Success	-	View
exp_pytrain.20260413193429.087_20260413_193430 Paper: pytrain.20260413193429.087	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 19:35	Success	-	View
exp_self.20260413193008.186_20260413_193009 Paper: self.20260413193008.186	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413193008.186 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 19:31	Success	-	View
exp_self.20260413192240.185_20260413_192240 Paper: self.20260413192240.185	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413192240.185 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 19:23	Success	-	View
exp_self.20260413191513.184_20260413_191514 Paper: self.20260413191513.184	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413191513.184 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 19:16	Success	-	View
exp_self.20260413190737.183_20260413_190738 Paper: self.20260413190737.183	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413190737.183 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 19:08	Success	-	View
exp_gh_qualcomm_ai-hub-apps_20260413_190448 Paper: gh_qualcomm_ai-hub-apps	qualcomm/ai-hub-apps Paper ID: gh_qualcomm_ai-hub-apps - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 19:05	Success	-	View
exp_pytrain.20260413190232.086_20260413_190232 Paper: pytrain.20260413190232.086	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 19:03	Success	-	View
exp_self.20260413185538.182_20260413_185538 Paper: self.20260413185538.182	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413185538.182 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 18:56	Success	-	View
exp_self.20260413184807.181_20260413_184807 Paper: self.20260413184807.181	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413184807.181 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 18:49	Success	-	View
exp_self.20260413184035.180_20260413_184035 Paper: self.20260413184035.180	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413184035.180 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 18:41	Success	-	View
exp_self.20260413183306.179_20260413_183307 Paper: self.20260413183306.179	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413183306.179 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 18:34	Success	-	View
exp_pytrain.20260413183033.085_20260413_183033 Paper: pytrain.20260413183033.085	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 18:31	Success	-	View
exp_self.20260413182453.178_20260413_182454 Paper: self.20260413182453.178	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413182453.178 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 18:25	Success	-	View
exp_self.20260413181723.177_20260413_181723 Paper: self.20260413181723.177	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413181723.177 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 18:18	Success	-	View
exp_self.20260413180947.176_20260413_180947 Paper: self.20260413180947.176	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413180947.176 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 18:10	Success	-	View
exp_self.20260413180241.175_20260413_180241 Paper: self.20260413180241.175	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413180241.175 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 18:03	Success	-	View
exp_pytrain.20260413175904.084_20260413_175904 Paper: pytrain.20260413175904.084	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 18:00	Success	-	View
exp_self.20260413175335.174_20260413_175336 Paper: self.20260413175335.174	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413175335.174 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 17:54	Success	-	View
exp_self.20260413174608.173_20260413_174609 Paper: self.20260413174608.173	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413174608.173 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 17:47	Success	-	View
exp_self.20260413173840.172_20260413_173840 Paper: self.20260413173840.172	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413173840.172 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 17:39	Success	-	View
exp_hf_2604.02315_20260413_173519 Paper: hf_2604.02315	Beyond the Assistant Turn: User Turn Generation as a Probe of Interaction Awareness in Language Models Paper ID: hf_2604.02315 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 17:36	Success	-	View
exp_self.20260413172951.171_20260413_172952 Paper: self.20260413172951.171	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413172951.171 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 17:30	Success	-	View
exp_pytrain.20260413172713.083_20260413_172714 Paper: pytrain.20260413172713.083	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 17:28	Success	-	View
exp_self.20260413172016.170_20260413_172016 Paper: self.20260413172016.170	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413172016.170 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 17:21	Success	-	View
exp_self.20260413171240.169_20260413_171240 Paper: self.20260413171240.169	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413171240.169 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 17:13	Success	-	View
exp_self.20260413170513.168_20260413_170513 Paper: self.20260413170513.168	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413170513.168 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 17:06	Success	-	View
exp_self.20260413165741.167_20260413_165741 Paper: self.20260413165741.167	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413165741.167 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 16:58	Success	-	View
exp_pytrain.20260413165501.082_20260413_165501 Paper: pytrain.20260413165501.082	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 16:56	Success	-	View
exp_self.20260413164805.166_20260413_164805 Paper: self.20260413164805.166	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413164805.166 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 16:49	Success	-	View
exp_self.20260413164033.165_20260413_164034 Paper: self.20260413164033.165	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413164033.165 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 16:41	Success	-	View
exp_self.20260413163251.164_20260413_163251 Paper: self.20260413163251.164	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413163251.164 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 16:33	Success	-	View
exp_self.20260413162522.163_20260413_162522 Paper: self.20260413162522.163	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413162522.163 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 16:26	Success	-	View
exp_pytrain.20260413162239.081_20260413_162239 Paper: pytrain.20260413162239.081	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 16:23	Success	-	View
exp_self.20260413161627.162_20260413_161628 Paper: self.20260413161627.162	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413161627.162 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 16:17	Success	-	View
exp_self.20260413160857.161_20260413_160857 Paper: self.20260413160857.161	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413160857.161 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 16:09	Success	-	View
exp_self.20260413160133.160_20260413_160133 Paper: self.20260413160133.160	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413160133.160 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 16:02	Success	-	View
exp_self.20260413155412.159_20260413_155412 Paper: self.20260413155412.159	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413155412.159 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 15:55	Success	-	View
exp_pytrain.20260413155041.080_20260413_155041 Paper: pytrain.20260413155041.080	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 15:51	Success	-	View
exp_self.20260413154634.158_20260413_154634 Paper: self.20260413154634.158	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413154634.158 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 15:47	Success	-	View
exp_self.20260413153914.157_20260413_153914 Paper: self.20260413153914.157	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413153914.157 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 15:40	Success	-	View
exp_self.20260413153157.156_20260413_153157 Paper: self.20260413153157.156	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413153157.156 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 15:33	Success	-	View
exp_self.20260413152441.155_20260413_152441 Paper: self.20260413152441.155	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413152441.155 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 15:25	Success	-	View
exp_pytrain.20260413151858.079_20260413_151858 Paper: pytrain.20260413151858.079	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 15:20	Success	-	View
exp_self.20260413151704.154_20260413_151705 Paper: self.20260413151704.154	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413151704.154 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 15:18	Success	-	View
exp_self.20260413150949.153_20260413_150950 Paper: self.20260413150949.153	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413150949.153 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 15:10	Success	-	View
exp_self.20260413150237.152_20260413_150237 Paper: self.20260413150237.152	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413150237.152 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 15:03	Success	-	View
exp_self.20260413145518.151_20260413_145519 Paper: self.20260413145518.151	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413145518.151 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 14:56	Success	-	View
exp_self.20260413144752.150_20260413_144752 Paper: self.20260413144752.150	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413144752.150 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 14:48	Success	-	View
exp_pytrain.20260413144534.078_20260413_144535 Paper: pytrain.20260413144534.078	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 14:46	Success	-	View
exp_self.20260413143846.149_20260413_143847 Paper: self.20260413143846.149	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413143846.149 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 14:39	Success	-	View
exp_self.20260413143134.148_20260413_143134 Paper: self.20260413143134.148	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413143134.148 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 14:32	Success	-	View
exp_self.20260413142413.147_20260413_142414 Paper: self.20260413142413.147	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413142413.147 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 14:25	Success	-	View
exp_self.20260413141638.146_20260413_141638 Paper: self.20260413141638.146	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413141638.146 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 14:17	Success	-	View
exp_pytrain.20260413141419.077_20260413_141419 Paper: pytrain.20260413141419.077	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 14:15	Success	-	View
exp_self.20260413140731.145_20260413_140731 Paper: self.20260413140731.145	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413140731.145 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 14:08	Success	-	View
exp_self.20260413140007.144_20260413_140008 Paper: self.20260413140007.144	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413140007.144 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 14:01	Success	-	View
exp_self.20260413135239.143_20260413_135240 Paper: self.20260413135239.143	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413135239.143 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 13:53	Success	-	View
exp_self.20260413134519.142_20260413_134520 Paper: self.20260413134519.142	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413134519.142 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 13:46	Success	-	View
exp_pytrain.20260413134259.076_20260413_134300 Paper: pytrain.20260413134259.076	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 13:44	Success	-	View
exp_self.20260413133737.141_20260413_133737 Paper: self.20260413133737.141	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413133737.141 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 13:38	Success	-	View
exp_self.20260413133015.140_20260413_133016 Paper: self.20260413133015.140	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413133015.140 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 13:31	Success	-	View
exp_self.20260413132255.139_20260413_132255 Paper: self.20260413132255.139	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413132255.139 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 13:23	Success	-	View
exp_hf_2604.04987_20260413_132006 Paper: hf_2604.04987	Cactus: Accelerating Auto-Regressive Decoding with Constrained Acceptance Speculative Sampling Paper ID: hf_2604.04987 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 13:21	Success	-	View
exp_self.20260413131311.138_20260413_131311 Paper: self.20260413131311.138	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413131311.138 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 13:14	Success	-	View
exp_pytrain.20260413131047.075_20260413_131047 Paper: pytrain.20260413131047.075	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 13:11	Success	-	View
exp_self.20260413130332.137_20260413_130332 Paper: self.20260413130332.137	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413130332.137 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 13:04	Success	-	View
exp_self.20260413125611.136_20260413_125611 Paper: self.20260413125611.136	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413125611.136 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 12:57	Success	-	View
exp_self.20260413124848.135_20260413_124849 Paper: self.20260413124848.135	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413124848.135 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 12:49	Success	-	View
exp_self.20260413124120.134_20260413_124120 Paper: self.20260413124120.134	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413124120.134 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 12:42	Success	-	View
exp_pytrain.20260413123900.074_20260413_123900 Paper: pytrain.20260413123900.074	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 12:40	Success	-	View
exp_self.20260413123157.133_20260413_123158 Paper: self.20260413123157.133	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413123157.133 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 12:33	Success	-	View
exp_self.20260413122432.132_20260413_122432 Paper: self.20260413122432.132	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413122432.132 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 12:25	Success	-	View
exp_self.20260413121709.131_20260413_121710 Paper: self.20260413121709.131	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413121709.131 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 12:18	Success	-	View
exp_self.20260413120923.130_20260413_120923 Paper: self.20260413120923.130	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413120923.130 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 12:10	Success	-	View
exp_pytrain.20260413120634.073_20260413_120634 Paper: pytrain.20260413120634.073	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 12:07	Success	-	View
exp_self.20260413115941.129_20260413_115942 Paper: self.20260413115941.129	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413115941.129 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 12:00	Success	-	View
exp_self.20260413115219.128_20260413_115220 Paper: self.20260413115219.128	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413115219.128 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 11:53	Success	-	View
exp_self.20260413114447.127_20260413_114447 Paper: self.20260413114447.127	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413114447.127 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 11:45	Success	-	View
exp_self.20260413113704.126_20260413_113705 Paper: self.20260413113704.126	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413113704.126 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 11:38	Success	-	View
exp_pytrain.20260413113435.072_20260413_113435 Paper: pytrain.20260413113435.072	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 11:35	Success	-	View
exp_self.20260413113014.125_20260413_113015 Paper: self.20260413113014.125	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413113014.125 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 11:31	Success	-	View
exp_self.20260413112251.124_20260413_112252 Paper: self.20260413112251.124	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413112251.124 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 11:23	Success	-	View
exp_self.20260413111516.123_20260413_111517 Paper: self.20260413111516.123	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413111516.123 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 11:16	Success	-	View
exp_self.20260413110742.122_20260413_110742 Paper: self.20260413110742.122	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413110742.122 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 11:08	Success	-	View
exp_hf_2604.09527_20260413_110451 Paper: hf_2604.09527	Envisioning the Future, One Step at a Time Paper ID: hf_2604.09527 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 11:05	Success	-	View
exp_pytrain.20260413110243.071_20260413_110243 Paper: pytrain.20260413110243.071	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 11:03	Success	-	View
exp_self.20260413105711.121_20260413_105712 Paper: self.20260413105711.121	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413105711.121 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 10:58	Success	-	View
exp_self.20260413104939.120_20260413_104940 Paper: self.20260413104939.120	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413104939.120 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 10:50	Success	-	View
exp_self.20260413104213.119_20260413_104214 Paper: self.20260413104213.119	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413104213.119 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 10:43	Success	-	View
exp_hf_2604.09482_20260413_103857 Paper: hf_2604.09482	Process Reward Agents for Steering Knowledge-Intensive Reasoning Paper ID: hf_2604.09482 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 10:39	Success	-	View
exp_self.20260413103335.118_20260413_103335 Paper: self.20260413103335.118	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413103335.118 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 10:34	Success	-	View
exp_pytrain.20260413103101.070_20260413_103101 Paper: pytrain.20260413103101.070	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 10:32	Success	-	View
exp_self.20260413102411.117_20260413_102411 Paper: self.20260413102411.117	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413102411.117 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 10:25	Success	-	View
exp_self.20260413101648.116_20260413_101648 Paper: self.20260413101648.116	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413101648.116 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 10:17	Success	-	View
exp_self.20260413100929.115_20260413_100929 Paper: self.20260413100929.115	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413100929.115 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 10:10	Success	-	View
exp_self.20260413100202.114_20260413_100203 Paper: self.20260413100202.114	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413100202.114 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 10:03	Success	-	View
exp_pytrain.20260413095826.069_20260413_095827 Paper: pytrain.20260413095826.069	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 09:59	Success	-	View
exp_self.20260413095417.113_20260413_095417 Paper: self.20260413095417.113	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413095417.113 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 09:55	Success	-	View
exp_self.20260413094654.112_20260413_094654 Paper: self.20260413094654.112	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413094654.112 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 09:47	Success	-	View
exp_self.20260413093927.111_20260413_093927 Paper: self.20260413093927.111	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413093927.111 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 09:40	Success	-	View
exp_hf_2604.09130_20260413_093638 Paper: hf_2604.09130	EquiformerV3: Scaling Efficient, Expressive, and General SE(3)-Equivariant Graph Attention Transformers Paper ID: hf_2604.09130 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 09:37	Success	-	View
exp_self.20260413092935.110_20260413_092936 Paper: self.20260413092935.110	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413092935.110 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 09:30	Success	-	View
exp_pytrain.20260413092708.068_20260413_092708 Paper: pytrain.20260413092708.068	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 09:28	Success	-	View
exp_hf_2604.01848_20260413_092425 Paper: hf_2604.01848	Semantic Richness or Geometric Reasoning? The Fragility of VLM's Visual Invariance Paper ID: hf_2604.01848 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 09:25	Success	-	View
exp_self.20260413092007.109_20260413_092007 Paper: self.20260413092007.109	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413092007.109 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 09:21	Success	-	View
exp_self.20260413091244.108_20260413_091244 Paper: self.20260413091244.108	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413091244.108 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 09:13	Success	-	View
exp_self.20260413090519.107_20260413_090520 Paper: self.20260413090519.107	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413090519.107 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 09:06	Success	-	View
exp_self.20260413085730.106_20260413_085731 Paper: self.20260413085730.106	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413085730.106 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 08:58	Success	-	View
exp_pytrain.20260413085446.067_20260413_085447 Paper: pytrain.20260413085446.067	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 08:55	Success	-	View
exp_self.20260413084746.105_20260413_084746 Paper: self.20260413084746.105	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413084746.105 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 08:48	Success	-	View
exp_self.20260413084025.104_20260413_084025 Paper: self.20260413084025.104	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413084025.104 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 08:41	Success	-	View
exp_self.20260413083237.103_20260413_083238 Paper: self.20260413083237.103	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413083237.103 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 08:33	Success	-	View
exp_self.20260413082502.102_20260413_082502 Paper: self.20260413082502.102	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413082502.102 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 08:26	Success	-	View
exp_pytrain.20260413082233.066_20260413_082234 Paper: pytrain.20260413082233.066	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 08:23	Success	-	View
exp_self.20260413081537.101_20260413_081538 Paper: self.20260413081537.101	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413081537.101 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 08:16	Success	-	View
exp_self.20260413080805.100_20260413_080805 Paper: self.20260413080805.100	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413080805.100 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 08:09	Success	-	View
exp_self.20260413080040.099_20260413_080040 Paper: self.20260413080040.099	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413080040.099 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 08:01	Success	-	View
exp_self.20260413075316.098_20260413_075317 Paper: self.20260413075316.098	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413075316.098 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 07:54	Success	-	View
exp_pytrain.20260413075049.065_20260413_075049 Paper: pytrain.20260413075049.065	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 07:51	Success	-	View
exp_self.20260413074357.097_20260413_074357 Paper: self.20260413074357.097	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413074357.097 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 07:45	Success	-	View
exp_self.20260413073627.096_20260413_073628 Paper: self.20260413073627.096	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413073627.096 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 07:37	Success	-	View
exp_self.20260413072905.095_20260413_072906 Paper: self.20260413072905.095	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413072905.095 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 07:30	Success	-	View
exp_self.20260413072140.094_20260413_072140 Paper: self.20260413072140.094	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413072140.094 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 07:22	Success	-	View
exp_pytrain.20260413071808.064_20260413_071808 Paper: pytrain.20260413071808.064	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 07:19	Success	-	View
exp_self.20260413071354.093_20260413_071354 Paper: self.20260413071354.093	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413071354.093 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 07:14	Success	-	View
exp_cr_10.1145_3800690_20260413_071104 Paper: cr_10.1145_3800690	Enabling Low-Latency, GPU-Efficient Serverless Inference with Model Swapping Paper ID: cr_10.1145_3800690 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered bench...	04-13 07:12	Success	-	View
exp_cr_10.1145_3807449_20260413_070711 Paper: cr_10.1145_3807449	Optimizing Attention for Large Language Model Inference on the MT-3000 Many-Core Processor Paper ID: cr_10.1145_3807449 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered bench...	04-13 07:08	Success	-	View
exp_self.20260413070451.092_20260413_070451 Paper: self.20260413070451.092	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413070451.092 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 07:05	Success	-	View
exp_cr_10.1145_3802593_20260413_070141 Paper: cr_10.1145_3802593	FDSR: Efficient Model Training via Adaptive Tensor Quantization Based on Frequency Domain Division and Similarity Data R... Paper ID: cr_10.1145_3802593 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered bench...	04-13 07:02	Success	-	View
exp_self.20260413065617.091_20260413_065617 Paper: self.20260413065617.091	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413065617.091 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 06:57	Success	-	View
exp_self.20260413064849.090_20260413_064849 Paper: self.20260413064849.090	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413064849.090 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 06:49	Success	-	View
exp_pytrain.20260413064618.063_20260413_064618 Paper: pytrain.20260413064618.063	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 06:47	Success	-	View
exp_self.20260413064201.089_20260413_064202 Paper: self.20260413064201.089	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413064201.089 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 06:43	Success	-	View
exp_self.20260413063429.088_20260413_063429 Paper: self.20260413063429.088	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413063429.088 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 06:35	Success	-	View
exp_self.20260413062659.087_20260413_062700 Paper: self.20260413062659.087	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413062659.087 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 06:28	Success	-	View
exp_self.20260413061935.086_20260413_061935 Paper: self.20260413061935.086	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413061935.086 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 06:20	Success	-	View
exp_pytrain.20260413061446.062_20260413_061447 Paper: pytrain.20260413061446.062	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 06:15	Success	-	View
exp_self.20260413061143.085_20260413_061148 Paper: self.20260413061143.085	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413061143.085 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 06:12	Success	-	View
exp_self.20260413060418.084_20260413_060418 Paper: self.20260413060418.084	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413060418.084 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 06:05	Success	-	View
exp_hf_2604.08118_20260413_060058 Paper: hf_2604.08118	Initialisation Determines the Basin: Efficient Codebook Optimisation for Extreme LLM Quantization Paper ID: hf_2604.08118 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 06:02	Success	-	View
exp_self.20260413055536.083_20260413_055537 Paper: self.20260413055536.083	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413055536.083 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 05:56	Success	-	View
exp_hf_2604.08540_20260413_055245 Paper: hf_2604.08540	AVGen-Bench: A Task-Driven Benchmark for Multi-Granular Evaluation of Text-to-Audio-Video Generation Paper ID: hf_2604.08540 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 05:53	Success	-	View
exp_self.20260413054522.082_20260413_054523 Paper: self.20260413054522.082	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413054522.082 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 05:46	Success	-	View
exp_pytrain.20260413054253.061_20260413_054254 Paper: pytrain.20260413054253.061	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 05:43	Success	-	View
exp_self.20260413053607.081_20260413_053607 Paper: self.20260413053607.081	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413053607.081 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 05:37	Success	-	View
exp_self.20260413052831.080_20260413_052831 Paper: self.20260413052831.080	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413052831.080 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 05:29	Success	-	View
exp_self.20260413052103.079_20260413_052103 Paper: self.20260413052103.079	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413052103.079 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 05:22	Success	-	View
exp_self.20260413051341.078_20260413_051342 Paper: self.20260413051341.078	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413051341.078 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 05:14	Success	-	View
exp_pytrain.20260413051112.060_20260413_051112 Paper: pytrain.20260413051112.060	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 05:12	Success	-	View
exp_self.20260413050416.077_20260413_050416 Paper: self.20260413050416.077	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413050416.077 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 05:05	Success	-	View
exp_self.20260413045638.076_20260413_045638 Paper: self.20260413045638.076	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413045638.076 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 04:57	Success	-	View
exp_self.20260413044911.075_20260413_044911 Paper: self.20260413044911.075	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413044911.075 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 04:50	Success	-	View
exp_self.20260413044147.074_20260413_044147 Paper: self.20260413044147.074	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413044147.074 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 04:42	Success	-	View
exp_pytrain.20260413043901.059_20260413_043901 Paper: pytrain.20260413043901.059	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 04:40	Success	-	View
exp_self.20260413043202.073_20260413_043203 Paper: self.20260413043202.073	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413043202.073 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 04:33	Success	-	View
exp_hf_2604.04415_20260413_042844 Paper: hf_2604.04415	Structured Causal Video Reasoning via Multi-Objective Alignment Paper ID: hf_2604.04415 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 04:29	Success	-	View
exp_self.20260413042432.072_20260413_042433 Paper: self.20260413042432.072	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413042432.072 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 04:25	Success	-	View
exp_self.20260413041707.071_20260413_041707 Paper: self.20260413041707.071	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413041707.071 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 04:18	Success	-	View
exp_self.20260413040943.070_20260413_040944 Paper: self.20260413040943.070	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413040943.070 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 04:10	Success	-	View
exp_pytrain.20260413040716.058_20260413_040716 Paper: pytrain.20260413040716.058	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 04:08	Success	-	View
exp_self.20260413040018.069_20260413_040018 Paper: self.20260413040018.069	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413040018.069 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 04:01	Success	-	View
exp_cr_10.3390_rs18081145_20260413_035558 Paper: cr_10.3390_rs18081145	Dynamic Expansion Mixture-of-Experts with Pre-Trained Vision Transformer for Few-Shot Class-Incremental Remote Sensing S... Paper ID: cr_10.3390_rs18081145 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered be...	04-13 03:57	Success	-	View
exp_self.20260413035256.068_20260413_035257 Paper: self.20260413035256.068	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413035256.068 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 03:53	Success	-	View
exp_self.20260413034524.067_20260413_034525 Paper: self.20260413034524.067	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413034524.067 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 03:46	Success	-	View
exp_self.20260413033746.066_20260413_033746 Paper: self.20260413033746.066	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413033746.066 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 03:38	Success	-	View
exp_pytrain.20260413033522.057_20260413_033522 Paper: pytrain.20260413033522.057	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 03:36	Success	-	View
exp_self.20260413033107.065_20260413_033107 Paper: self.20260413033107.065	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413033107.065 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 03:32	Success	-	View
exp_self.20260413032334.064_20260413_032335 Paper: self.20260413032334.064	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413032334.064 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 03:24	Success	-	View
exp_self.20260413031558.063_20260413_031558 Paper: self.20260413031558.063	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413031558.063 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 03:17	Success	-	View
exp_self.20260413030812.062_20260413_030813 Paper: self.20260413030812.062	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413030812.062 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 03:09	Success	-	View
exp_pytrain.20260413030335.056_20260413_030335 Paper: pytrain.20260413030335.056	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 03:04	Success	-	View
exp_self.20260413030033.061_20260413_030033 Paper: self.20260413030033.061	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413030033.061 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 03:01	Success	-	View
exp_self.20260413025302.060_20260413_025302 Paper: self.20260413025302.060	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413025302.060 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 02:54	Success	-	View
exp_self.20260413024538.059_20260413_024538 Paper: self.20260413024538.059	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413024538.059 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 02:46	Success	-	View
exp_self.20260413023805.058_20260413_023806 Paper: self.20260413023805.058	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413023805.058 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 02:39	Success	-	View
exp_pytrain.20260413023150.055_20260413_023150 Paper: pytrain.20260413023150.055	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 02:32	Success	-	View
exp_self.20260413022957.057_20260413_022957 Paper: self.20260413022957.057	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413022957.057 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 02:30	Success	-	View
exp_self.20260413022231.056_20260413_022231 Paper: self.20260413022231.056	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413022231.056 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 02:23	Success	-	View
exp_self.20260413021500.055_20260413_021501 Paper: self.20260413021500.055	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413021500.055 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 02:16	Success	-	View
exp_self.20260413020711.054_20260413_020711 Paper: self.20260413020711.054	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413020711.054 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 02:08	Success	-	View
exp_cr_10.54254_2755-2721_2026.ba32663_20260413_020400 Paper: cr_10.54254_2755-2721_2026.ba32663	Comparative Study of LSTM, Transformer, and Mixture of Experts for RUL Prediction with Regime-Aware Optimization Researc... Paper ID: cr_10.54254_2755-2721_2026.ba32663 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal:...	04-13 02:05	Success	-	View
exp_pytrain.20260413015938.054_20260413_015939 Paper: pytrain.20260413015938.054	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 02:00	Success	-	View
exp_self.20260413015745.053_20260413_015746 Paper: self.20260413015745.053	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413015745.053 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 01:58	Success	-	View
exp_self.20260413015018.052_20260413_015018 Paper: self.20260413015018.052	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413015018.052 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 01:51	Success	-	View
exp_self.20260413014253.051_20260413_014254 Paper: self.20260413014253.051	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413014253.051 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 01:43	Success	-	View
exp_self.20260413013527.050_20260413_013527 Paper: self.20260413013527.050	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413013527.050 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 01:36	Success	-	View
exp_hf_2604.08626_20260413_013207 Paper: hf_2604.08626	WildDet3D: Scaling Promptable 3D Detection in the Wild Paper ID: hf_2604.08626 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 01:33	Success	-	View
exp_pytrain.20260413012749.053_20260413_012750 Paper: pytrain.20260413012749.053	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 01:28	Success	-	View
exp_self.20260413012556.049_20260413_012556 Paper: self.20260413012556.049	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413012556.049 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 01:26	Success	-	View
exp_self.20260413011829.048_20260413_011829 Paper: self.20260413011829.048	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413011829.048 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 01:19	Success	-	View
exp_hf_2604.07786_20260413_011510 Paper: hf_2604.07786	Cross-Modal Emotion Transfer for Emotion Editing in Talking Face Video Paper ID: hf_2604.07786 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 01:16	Success	-	View
exp_2604.09547v1_20260413_011251 Paper: 2604.09547v1	Tango: Taming Visual Signals for Efficient Video Large Language Models Paper ID: 2604.09547v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-13 01:13	Success	-	View
exp_cr_10.38124_ijisrt_26apr247_20260413_011001 Paper: cr_10.38124_ijisrt_26apr247	Leveraging Gemma 4 Large Language Model for Protein Function Prediction and Interpretability Application of AI Models fo... Paper ID: cr_10.38124_ijisrt_26apr247 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recove...	04-13 01:11	Success	-	View
exp_hf_2604.09450_20260413_010740 Paper: hf_2604.09450	ECHO: Efficient Chest X-ray Report Generation with One-step Block Diffusion Paper ID: hf_2604.09450 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 01:08	Success	-	View
exp_self.20260413010542.047_20260413_010543 Paper: self.20260413010542.047	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260413010542.047 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-13 01:06	Success	-	View
exp_hf_2604.08995_20260413_010246 Paper: hf_2604.08995	Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory Paper ID: hf_2604.08995 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-13 01:03	Success	-	View
exp_pytrain.20260413005442.052_20260413_005542 Paper: pytrain.20260413005442.052	Python Skill Fallback Title: Strictly Typed Event Dispatcher Library - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-13 00:56	Success	-	View
exp_self.20260413002630.046_20260413_002654 Paper: self.20260413002630.046	SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy improves throughput under strict 8GB VRAM constraints compared to standard sequence processing. Methodology We compare two dist...	04-13 00:28	Success	-	View
exp_pytrain.20260412232411.051_20260412_232449 Paper: pytrain.20260412232411.051	Python Skill Fallback Title: Type-Aware CLI Argument Binder - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-12 23:25	Success	-	View
exp_self.20260412230034.045_20260412_230100 Paper: self.20260412230034.045	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260412230034.045 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-12 23:02	Success	-	View
exp_pytrain.20260412215617.050_20260412_215715 Paper: pytrain.20260412215617.050	Dynamic Type-Verified Plugin Loader This benchmark validates a robust plugin architecture implementation based on `typing.Protocol` and `importlib`. It simulates an autonomous system that receives raw code artifacts, dynamically packages them into a runtime module, and enforc...	04-12 21:58	Success	-	View
exp_gh_piroplayers69-ops_S3T-Former_20260412_214229 Paper: gh_piroplayers69-ops_S3T-Former	piroplayers69-ops/S3T-Former Paper ID: gh_piroplayers69-ops_S3T-Former - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Re...	04-12 21:43	Success	-	View
exp_pytrain.20260412211950.049_20260412_212018 Paper: pytrain.20260412211950.049	Generic Type-Safe CLI Command Builder This benchmark evaluates the design and implementation of a robust, type-safe command-line interface (CLI) framework using Python's standard library. Problem Statement The goal is to construct a `cli_builder` framework that enforces strong...	04-12 21:21	Success	-	View
exp_self.20260412205759.044_20260412_205827 Paper: self.20260412205759.044	Self-directed SSM Strategy Stress Test Benchmark Overview This benchmark evaluates the efficiency of State Space Models (SSM) versus standard Attention-based architectures under strict memory constraints. The "Innovation" is the utilization of an SSM strategy (mimicking Mamba-style select...	04-12 20:59	Success	-	View
exp_pytrain.20260412201149.048_20260412_201207 Paper: pytrain.20260412201149.048	Generic Dependency Container with CLI Entry Point This coding drill benchmarks the implementation of a dependency injection container using Python 3.12's modern Type Parameter Syntax (PEP 695). It enforces a strict separation of concerns, treating the logic as a reusable library and the `m...	04-12 20:13	Success	-	View
exp_self.20260412195156.043_20260412_195217 Paper: self.20260412195156.043	Self-directed SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with a disciplined memory policy improves throughput under constrained 8GB VRAM environments. It compares a Baseline (naive memory handling) agains...	04-12 19:53	Success	-	View
exp_pytrain.20260412190615.047_20260412_190640 Paper: pytrain.20260412190615.047	Strictly Typed Component Registry Benchmark This benchmark evaluates the implementation of a strictly typed component registry system using Python's `typing.Protocol` (PEP 544) to enforce structural subtyping. It simulates a modular architecture for performing operations on tensor-li...	04-12 19:07	Success	-	View
exp_self.20260412184637.042_20260412_184656 Paper: self.20260412184637.042	Self-directed Benchmark: SSM Strategy Stress Test This benchmark evaluates the performance of a disciplined State Space Model (SSM) implementation against a baseline approach under strict memory constraints (simulating an 8GB VRAM limit). Hypothesis Applying an SSM with a disciplined memor...	04-12 18:48	Success	-	View
exp_pytrain.20260412175234.046_20260412_175259 Paper: pytrain.20260412175234.046	Python Skill Fallback Title: Dynamic Plugin Loader with Runtime Typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-12 17:54	Success	-	View
exp_self.20260412173053.041_20260412_173114 Paper: self.20260412173053.041	SSM Strategy Stress Test This benchmark evaluates the hypothesis that State Space Models (SSMs), specifically the Mamba architecture, provide higher inference throughput and better VRAM utilization under 8GB constraints compared to traditional Transformer-based mod...	04-12 17:32	Success	-	View
exp_pytrain.20260412164417.045_20260412_164437 Paper: pytrain.20260412164417.045	Dynamic CLI Plugin System Benchmark This benchmark tests your ability to implement a robust, type-safe plugin architecture using Python's standard library. You will define a Protocol for interface enforcement, a Registry for dependency management, and use `importlib` to dynam...	04-12 16:45	Success	-	View
exp_self.20260412162331.040_20260412_162357 Paper: self.20260412162331.040	Self-directed benchmark: SSM Strategy Stress Test This repository contains a minimal, runnable benchmark designed to test the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy improves throughput under constrained 8GB VRAM environments. Objective To compar...	04-12 16:25	Success	-	View
exp_pytrain.20260412153650.044_20260412_153715 Paper: pytrain.20260412153650.044	Python Skill Fallback Title: Strict Package Metadata Validator - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-12 15:38	Success	-	View
exp_self.20260412151622.039_20260412_151650 Paper: self.20260412151622.039	SSM Strategy Stress Test: Memory Policy & Precision This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy (chunked processing, state retention, and mixed precision) significantly improves throughput and reduces VRAM usage compared to...	04-12 15:18	Success	-	View
exp_pytrain.20260412143046.043_20260412_143104 Paper: pytrain.20260412143046.043	Python Skill Fallback Title: Dynamic Plugin Loader with Protocol Enforcement - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-12 14:32	Success	-	View
exp_self.20260412141123.038_20260412_141148 Paper: self.20260412141123.038	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260412141123.038 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-12 14:12	Success	-	View
exp_pytrain.20260412132229.042_20260412_132256 Paper: pytrain.20260412132229.042	Python Skill Fallback Title: Generic Package Loader with PEP 695 - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-12 13:23	Success	-	View
exp_self.20260412130035.037_20260412_130119 Paper: self.20260412130035.037	SSM Strategy Stress Test This benchmark evaluates the "Self-directed benchmark: ssm strategy stress test" hypothesis, specifically testing whether a disciplined memory policy (specifically `dynamic_precision` scaling) applied to SSM architectures (Mamba) improves t...	04-12 13:02	Success	-	View
exp_pytrain.20260412120400.041_20260412_120500 Paper: pytrain.20260412120400.041	Python Skill Fallback Title: Strict Dynamic Plugin Loader with Protocol Enforcement - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-12 12:06	Success	-	View
exp_gh_JacobHuang91_prompt-refiner_20260412_115035 Paper: gh_JacobHuang91_prompt-refiner	Benchmark for JacobHuang91/prompt-refiner This benchmark evaluates the performance of the `prompt-refiner` library, focusing on its ability to manage context windows and optimize token usage for LLM applications. Overview The `prompt-refiner` library claims to save 10-20% on API co...	04-12 11:51	Success	-	View
exp_pytrain.20260412112501.040_20260412_112538 Paper: pytrain.20260412112501.040	Typed Plugin Registry with Semantic Versioning Overview This benchmark implements a high-performance, type-safe plugin registry system simulating a modern AI package manager. It utilizes advanced Python `typing` features (Generics, Protocols, TypeVars) and `dataclasses` to manage data t...	04-12 11:26	Success	-	View
exp_self.20260412110044.036_20260412_110124 Paper: self.20260412110044.036	Self-directed Benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy (specifically leveraging dynamic precision and cache management) improves throughput under constrained memory (simulated 8GB li...	04-12 11:02	Success	-	View
exp_pytrain.20260412100230.039_20260412_100317 Paper: pytrain.20260412100230.039	Generic Auto-Registry with Dynamic Module Loading This coding drill focuses on advanced Python `typing` and dynamic module loading mechanisms, commonly found in frameworks like Hugging Face Transformers. The benchmark constructs a self-contained environment where a virtual package is gener...	04-12 10:04	Success	-	View
exp_self.20260412093503.035_20260412_093541 Paper: self.20260412093503.035	Small, Runnable Benchmark: SSM Strategy Stress Test This benchmark is designed to test the hypothesis that applying SSM (State Space Models) with a disciplined memory policy improves throughput under 8GB constraints. README.md SSM Strategy Stress Test Benchmark Overview This benchmark ev...	04-12 09:38	Success	-	View
exp_pytrain.20260412083845.038_20260412_083915 Paper: pytrain.20260412083845.038	Python Skill Fallback Title: Typed Plugin Architecture Simulator - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-12 08:40	Success	-	View
exp_self.20260412081456.034_20260412_081524 Paper: self.20260412081456.034	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260412081456.034 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-12 08:16	Success	-	View
exp_pytrain.20260412071638.037_20260412_071659 Paper: pytrain.20260412071638.037	Dynamic Protocol-Based Extension Loader Overview This benchmark evaluates a Python system's capability to enforce strict structural typing using `typing.Protocol` while dynamically discovering and loading logic using `importlib`. Hypothesis An autonomous coding system can create...	04-12 07:18	Success	-	View
exp_self.20260412065209.033_20260412_065239 Paper: self.20260412065209.033	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260412065209.033 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-12 06:53	Success	-	View
exp_pytrain.20260412055546.036_20260412_055604 Paper: pytrain.20260412055546.036	Python Skill Fallback Title: Typed Dynamic Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-12 05:57	Success	-	View
exp_self.20260412053141.032_20260412_053212 Paper: self.20260412053141.032	SSM Strategy Stress Test Benchmark This repository contains a minimal, runnable benchmark designed to test the hypothesis that a disciplined memory policy (Dynamic Precision + Selective Caching) applied to State Space Model (SSM) layers improves throughput under constrained...	04-12 05:33	Success	-	View
exp_pytrain.20260412044037.035_20260412_044106 Paper: pytrain.20260412044037.035	Python Skill Fallback Title: Strictly Typed Plugin Registry with Logical Namespacing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-12 04:42	Success	-	View
exp_self.20260412042010.031_20260412_042032 Paper: self.20260412042010.031	SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with a disciplined memory policy improves throughput under strict 8GB VRAM constraints. Concept The benchmark compares two approaches to processing...	04-12 04:21	Success	-	View
exp_pytrain.20260412032915.034_20260412_032936 Paper: pytrain.20260412032915.034	Runtime-Validated Plugin Registry This coding drill evaluates the ability to design a robust plugin system using Python's standard library. The candidate must implement an `ExtensionLoader` that dynamically discovers, loads, and validates external Python modules against str...	04-12 03:30	Success	-	View
exp_self.20260412025342.030_20260412_025422 Paper: self.20260412025342.030	SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy (dynamic precision and memory-efficient scanning) improves throughput compared to a naive float32 implementation under tight 8G...	04-12 03:05	Success	-	View
exp_pytrain.20260412015407.033_20260412_015433 Paper: pytrain.20260412015407.033	Generic Backend Registry with Protocol Enforcement Objective: Design and implement a modular inference engine simulation that strictly decouples interface definitions from concrete implementations. The solution must leverage Python's `typing.Protocol`, `TypeVar`, and Generic programming...	04-12 01:55	Success	-	View
exp_self.20260412013115.029_20260412_013141 Paper: self.20260412013115.029	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying Selective State Space Models (SSM) with a disciplined memory policy (dynamic precision) improves throughput under strict VRAM constraints compared to standard attention mechanisms. Conte...	04-12 01:33	Success	-	View
exp_pytrain.20260412004008.032_20260412_004035 Paper: pytrain.20260412004008.032	Strictly-Typed Component Registry and Dynamic Namespace Loader Benchmark This benchmark evaluates the ability to architect internal SDK structures similar to large-scale libraries like HuggingFace Transformers. It tests the implementation of a robust registry pattern, Protocol enforcement, and dynamic namespace...	04-12 00:41	Success	-	View
exp_self.20260412001746.028_20260412_001807 Paper: self.20260412001746.028	SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies—specifically disciplined memory policies and dynamic precision—improves throughput under constrained VRAM (8GB) environments. Methodology We compare tw...	04-12 00:19	Success	-	View
exp_pytrain.20260411232628.031_20260411_232657 Paper: pytrain.20260411232628.031	Runtime-Type-Checked Plugin Registry This coding drill implements a modular Plugin Manager system leveraging Python's `typing.Protocol` for structural subtyping and runtime validation. Unlike traditional inheritance-based architectures, this system enforces contracts via type...	04-11 23:28	Success	-	View
exp_self.20260411230611.027_20260411_230646 Paper: self.20260411230611.027	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260411230611.027 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-11 23:07	Success	-	View
exp_pytrain.20260411221202.030_20260411_221230 Paper: pytrain.20260411221202.030	Dynamic Async Plugin System Loader Overview This benchmark tests your ability to design a robust runtime code loading system using Python's standard library. It focuses on dynamic packaging, strict type enforcement using `typing.Protocol`, and asynchronous execution handling...	04-11 22:13	Success	-	View
exp_self.20260411215054.026_20260411_215114 Paper: self.20260411215054.026	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260411215054.026 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-11 21:52	Success	-	View
exp_pytrain.20260411210019.029_20260411_210047 Paper: pytrain.20260411210019.029	Dynamic Virtual Package Loader with Strict Protocol Enforcement Overview This benchmark tests your ability to manipulate Python's import system and enforce type safety using modern typing protocols. Scenario: You are building a plugin system where modules are generated dynamically at runtime (e.g.,...	04-11 21:01	Success	-	View
exp_self.20260411204051.025_20260411_204128 Paper: self.20260411204051.025	Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the efficiency of State Space Models (SSM) versus standard Transformer architectures under constrained VRAM conditions (8GB limit). It specifically tests the hypothesis that an SSM implementation with a dis...	04-11 20:42	Success	-	View
exp_pytrain.20260411200250.028_20260411_200317 Paper: pytrain.20260411200250.028	Benchmark: Generic Entry-Point Plugin Loader Overview This benchmark evaluates the implementation of a type-safe, generic plugin loading mechanism. It tests the candidate's ability to combine Python's static type safety features (Generics, Protocols) with dynamic runtime introspection...	04-11 20:04	Success	-	View
exp_self.20260411194619.024_20260411_194631 Paper: self.20260411194619.024	Self-directed Benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying a State Space Model (SSM) approach with a disciplined memory policy improves throughput and reduces VRAM usage compared to a standard baseline implementation. Objective To simulate the m...	04-11 19:47	Success	-	View
exp_pytrain.20260411190821.027_20260411_190845 Paper: pytrain.20260411190821.027	Python Reliability Drill: Typing & Verification This drill implements a mock inference engine using strict Python typing and standard library tools. It simulates tensor operations and memory allocation patterns typical in LLM workloads (referenced from PyTorch and LitGPT contexts) withou...	04-11 19:09	Success	-	View
exp_self.20260411185202.023_20260411_185228 Paper: self.20260411185202.023	Self-directed benchmark: ssm strategy stress test Overview This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy improves throughput under constrained memory environments (approx 8GB VRAM). It compares a standard Transformer block...	04-11 18:53	Success	-	View
exp_pytrain.20260411181400.026_20260411_181430 Paper: pytrain.20260411181400.026	Benchmark: Strict Backend Registry with PEP 440 Versioning This benchmark evaluates the implementation of a robust `PluginRegistry` system typical in high-performance ML inference engines (like vLLM or Diffusers). Objective Candidates must implement a registry system using Python's standard library...	04-11 18:15	Success	-	View
exp_self.20260411175636.022_20260411_175657 Paper: self.20260411175636.022	Self-directed benchmark: ssm strategy stress test Objective This benchmark evaluates the efficacy of a disciplined memory management policy for State Space Models (specifically mimicking Mamba-style SSMs) under a strict 8GB VRAM constraint. Hypothesis Applying SSM operations with a discipl...	04-11 17:58	Success	-	View
exp_pytrain.20260411171609.025_20260411_171633 Paper: pytrain.20260411171609.025	Python Skill Fallback Title: Strict Typed Module Interface and CLI Entry Point - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-11 17:17	Success	-	View
exp_self.20260411165411.021_20260411_165447 Paper: self.20260411165411.021	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the memory efficiency of State Space Models (SSM) compared to standard Transformer Attention mechanisms under high-sequence-length stress tests. Hypothesis Applying SSM with a disciplined memory policy improves thro...	04-11 16:57	Success	-	View
exp_pytrain.20260411160747.024_20260411_160808 Paper: pytrain.20260411160747.024	Dynamic Plugin Loader with Protocol Enforcement This benchmark tests your ability to use Python's standard library to perform dynamic code generation, filesystem manipulation, and runtime type verification. Objective Create a Python script that programmatically defines a strict `Protocol...	04-11 16:09	Success	-	View
exp_self.20260411153809.020_20260411_153832 Paper: self.20260411153809.020	Self-directed SSM Strategy Stress Test Overview This benchmark validates the hypothesis that applying State Space Model (SSM) strategies with disciplined memory policies improves throughput and efficiency under strict 8GB VRAM constraints. It compares a Baseline approach (simula...	04-11 15:49	Success	-	View
exp_pytrain.20260411145202.023_20260411_145220 Paper: pytrain.20260411145202.023	Structural Subtyping and Dynamic Module Loading Benchmark This benchmark tests the ability to combine static structural typing (`typing.Protocol`) with dynamic module introspection (`importlib`). The objective is to build a robust, minimalistic plugin architecture that allows an autonomous system...	04-11 14:53	Success	-	View
exp_self.20260411143309.019_20260411_143346 Paper: self.20260411143309.019	Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that applying State Space Models (SSM) with a disciplined memory policy (specifically the Mamba architecture) significantly improves throughput and reduces VRAM overhead compared to stand...	04-11 14:34	Success	-	View
exp_pytrain.20260411134749.022_20260411_134812 Paper: pytrain.20260411134749.022	Python 3.12 Type Parameter Syntax Benchmark Objective This benchmark evaluates the runtime behavior and validity of Python 3.12's PEP 695 Type Parameter Syntax within a dynamic package generation scenario. It simulates a meta-build system that generates source code on-the-fly to veri...	04-11 13:49	Success	-	View
exp_self.20260411131521.018_20260411_131603 Paper: self.20260411131521.018	SSM Strategy Stress Test: Memory vs. Throughput This benchmark evaluates the State Space Model (SSM) innovation regarding memory efficiency. The core hypothesis is that an SSM-based architecture with a disciplined memory policy can maintain high throughput (tokens/sec) while drastica...	04-11 13:29	Success	-	View
exp_pytrain.20260411122601.021_20260411_122632 Paper: pytrain.20260411122601.021	Python Skill Fallback Title: Strictly Typed Dependency Injection Container - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-11 12:27	Success	-	View
exp_self.20260411120227.017_20260411_120259 Paper: self.20260411120227.017	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260411120227.017 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-11 12:04	Success	-	View
exp_pytrain.20260411111143.020_20260411_111217 Paper: pytrain.20260411111143.020	Python Skill Fallback Title: Strictly Typed Artifact Registry - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-11 11:13	Success	-	View
exp_self.20260411104959.016_20260411_105034 Paper: self.20260411104959.016	SSM Strategy Stress Test This repository contains a lightweight, runnable benchmark designed to test the hypothesis that applying SSM (State Space Model) strategies with a disciplined memory policy improves throughput under 8GB VRAM constraints. Hypothesis Stan...	04-11 10:52	Success	-	View
exp_pytrain.20260411100253.019_20260411_100312 Paper: pytrain.20260411100253.019	Strictly Typed Dynamic Plugin Loader Overview This benchmark evaluates the system's ability to simulate the packaging and dynamic loading patterns common in modern ML libraries (e.g., HuggingFace Transformers). It programmatically generates a Python package structure at runtim...	04-11 10:04	Success	-	View
exp_cr_10.1007_s44443-026-00723-5_20260411_095100 Paper: cr_10.1007_s44443-026-00723-5	TM-RAG: a transformer-mamba model for long-text evidence aggregation in retrieval-augmented generation Paper ID: cr_10.1007_s44443-026-00723-5 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Reco...	04-11 09:52	Success	-	View
exp_pytrain.20260411093037.018_20260411_093107 Paper: pytrain.20260411093037.018	Python Skill Fallback Title: Type-Safe Plugin Discovery using importlib - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-11 09:32	Success	-	View
exp_self.20260411090958.015_20260411_091020 Paper: self.20260411090958.015	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that State Space Models (SSMs), employing disciplined memory policies (constant state size), offer superior throughput compared to standard Attention mechanisms under strict VRAM constraints (8GB). Me...	04-11 09:11	Success	-	View
exp_pytrain.20260411082224.017_20260411_082242 Paper: pytrain.20260411082224.017	Robust Package Scaffolder Benchmark This benchmark tests the ability to generate a Python project structure using strict type definitions (`TypedDict`, `NewType`, `Literal`), `argparse` for CLI interaction, and `pathlib` for file system operations. Usage Run the script direct...	04-11 08:23	Success	-	View
exp_self.20260411080236.014_20260411_080303 Paper: self.20260411080236.014	--- Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Models (SSM) with a disciplined memory policy (specifically dynamic precision and optimized caching strategies) improves thr...	04-11 08:04	Success	-	View
exp_pytrain.20260411071439.016_20260411_071505 Paper: pytrain.20260411071439.016	Strictly-Typed ZipApp Constructor This benchmark evaluates a Python environment's ability to perform a micro-packaging pipeline that strictly adheres to typing protocols. Objective The goal is to dynamically generate a standalone Python application archive (`.pyz`) that imp...	04-11 07:16	Success	-	View
exp_oa_W7152933450_20260411_070235 Paper: oa_W7152933450	BOSCH: Black-Box Binary Optimization for Short-Context Attention-Head Selection in LLMs Paper ID: oa_W7152933450 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-11 07:03	Success	-	View
exp_pytrain.20260411064112.015_20260411_064135 Paper: pytrain.20260411064112.015	Type-Safe Generic Event Dispatcher Benchmark This project implements a Type-Safe Generic Event Dispatcher using modern Python 3.12+ features, specifically PEP 695 (Type Parameter Syntax) and PEP 544 (Protocols). It serves as a coding drill to verify static type safety constructs and r...	04-11 06:42	Success	-	View
exp_self.20260411062140.013_20260411_062159 Paper: self.20260411062140.013	SSM Strategy Stress Test This repository contains a minimal benchmark designed to evaluate the efficiency of State Space Models (SSMs) versus standard recurrent accumulation when dealing with long sequence dependencies under strict memory constraints. Objective The...	04-11 06:23	Success	-	View
exp_pytrain.20260411053337.014_20260411_053401 Paper: pytrain.20260411053337.014	Python Skill Fallback Title: Strict PyProject Metadata Validator - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-11 05:35	Success	-	View
exp_self.20260411051126.012_20260411_051147 Paper: self.20260411051126.012	SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying a State Space Model (SSM) strategy with a disciplined memory policy (specifically: chunked inference with state caching and dynamic precision) improves inference throughput under strict...	04-11 05:13	Success	-	View
exp_pytrain.20260411041837.013_20260411_041912 Paper: pytrain.20260411041837.013	Typing-Safe Dynamic Plugin Loader This benchmark tests the ability to construct a robust, dynamic class loading mechanism using `importlib` and `typing.Protocol`. The goal is to simulate a modular architecture where classes are loaded at runtime based on string identifiers...	04-11 04:20	Success	-	View
exp_self.20260411035703.011_20260411_035730 Paper: self.20260411035703.011	SSM Strategy Stress Test This benchmark evaluates the performance of State Space Models (specifically Mamba) under strict VRAM constraints. It contrasts a Standard Baseline against a Precision-Optimized variant to verify the hypothesis that disciplined memo...	04-11 03:58	Success	-	View
exp_pytrain.20260411030229.012_20260411_030249 Paper: pytrain.20260411030229.012	Dynamic Component Loader with Strict Protocol Validation This benchmark evaluates the implementation of a robust, ML-style plugin architecture using Python's standard library. The design simulates a Model Registration system where "plugin" modules are loaded dynamically from memory without touchi...	04-11 03:03	Success	-	View
exp_self.20260411024146.010_20260411_024215 Paper: self.20260411024146.010	SSM Strategy Stress Test Benchmark Overview This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with disciplined memory policies improves throughput under 8GB VRAM constraints compared to standard Transformer architectures. It compares tw...	04-11 02:43	Success	-	View
exp_pytrain.20260411015154.011_20260411_015227 Paper: pytrain.20260411015154.011	Python Skill Fallback Title: Strictly Typed Configuration & CLI Entry Point - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-11 01:53	Success	-	View
exp_self.20260411013004.009_20260411_013045 Paper: self.20260411013004.009	SSM Strategy Stress Test Benchmark Overview This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy significantly improves inference throughput and reduces VRAM overhead compared to standard attention mechanisms when h...	04-11 01:32	Success	-	View
exp_pytrain.20260411004055.010_20260411_004120 Paper: pytrain.20260411004055.010	Python Skill Fallback Title: Strictly-Typed Dependency Visualizer - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-11 00:42	Success	-	View
exp_self.20260411002149.008_20260411_002206 Paper: self.20260411002149.008	Self-directed benchmark: ssm strategy stress test This benchmark evaluates the hypothesis that a disciplined memory policy within an SSM (State Space Model) architecture improves throughput under strict 8GB VRAM constraints. We compare a Baseline (Standard Transformer Attention mechani...	04-11 00:23	Success	-	View
exp_pytrain.20260410233809.009_20260410_233839 Paper: pytrain.20260410233809.009	Dynamic Plugin System with Runtime Type Verification This benchmark tests the ability to design a modular, type-safe plugin system using Python's standard library. It evaluates the candidate's proficiency with `typing.Protocol` for interface definition, `importlib` for dynamic module loading,...	04-10 23:39	Success	-	View
exp_self.20260410231535.007_20260410_231609 Paper: self.20260410231535.007	SSM Strategy Stress Test This benchmark evaluates the performance characteristics of a State Space Model (SSM) implementation under memory pressure. It compares a naive, full-sequence processing approach against a disciplined memory policy that utilizes chunked sca...	04-10 23:17	Success	-	View
exp_pytrain.20260410222110.008_20260410_222140 Paper: pytrain.20260410222110.008	Self-Validating Plugin Registry with Dynamic Imports Overview This benchmark evaluates a Python system's capability to dynamically construct, load, and validate software modules without relying on external files. It tests the integration of `importlib` for runtime module management and `typin...	04-10 22:22	Success	-	View
exp_gh_onehundredfifty-myelatelia678_streaminfer_20260410_220818 Paper: gh_onehundredfifty-myelatelia678_streaminfer	Benchmark: Streaming Inference with Adaptive Batching This benchmark evaluates the performance of a streaming inference engine. It simulates a real-time workload where input requests arrive continuously. The engine implements adaptive batching (grouping requests to maximize throughput) and bac...	04-10 22:09	Success	-	View
exp_pytrain.20260410214603.007_20260410_214646 Paper: pytrain.20260410214603.007	Type-Safe Tensor Arithmetic Package Benchmark Objective Design and implement a robust Python package named `tensor_lite` that performs basic 2D matrix operations. The solution must demonstrate proficiency in modern Python packaging, static typing using Generics and Protocols, and basic...	04-10 21:47	Success	-	View
exp_self.20260410212356.006_20260410_212427 Paper: self.20260410212356.006	Self-Directed Benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Model (SSM) logic with a disciplined memory policy (dynamic precision and strict state management) improves inference throughput under constrained VRAM (8GB) compared to stan...	04-10 21:25	Success	-	View
exp_pytrain.20260410202331.006_20260410_202400 Paper: pytrain.20260410202331.006	Typed Plugin Registry and Namespace Dispatcher Overview This benchmark demonstrates a robust, modular architecture using Python's standard `typing` module. It simulates a multi-package ecosystem (Core, Models, Utils) within a single script by leveraging class-based namespaces and `__all...	04-10 20:25	Success	-	View
exp_self.20260410195844.005_20260410_195903 Paper: self.20260410195844.005	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260410195844.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-10 20:00	Success	-	View
exp_pytrain.20260410191054.005_20260410_191112 Paper: pytrain.20260410191054.005	Strictly Typed Source Distribution Builder This benchmark evaluates the generation of a Python build script that enforces strict type safety using standard library modules (`typing`, `dataclasses`). Overview The system must construct a valid `PackageMetadata` schema and a runtim...	04-10 19:12	Success	-	View
exp_self.20260410185055.004_20260410_185129 Paper: self.20260410185055.004	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260410185055.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-10 18:52	Success	-	View
exp_pytrain.20260410180458.004_20260410_180525 Paper: pytrain.20260410180458.004	Strictly Typed Configuration Manager Benchmark This benchmark evaluates your ability to construct a robust, single-file Python module that demonstrates professional packaging standards (PEP 8 compliance, import organization, module metadata) and utilizes Python's static typing system to...	04-10 18:06	Success	-	View
exp_self.20260410174513.003_20260410_174534 Paper: self.20260410174513.003	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260410174513.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-10 17:46	Success	-	View
exp_pytrain.20260410165732.003_20260410_165753 Paper: pytrain.20260410165732.003	Python Skill Fallback Title: Strictly Typed Configuration Loader with Module Encapsulation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-10 16:58	Success	-	View
exp_self.20260410163757.002_20260410_163836 Paper: self.20260410163757.002	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260410163757.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-10 16:39	Success	-	View
exp_pytrain.20260410154958.002_20260410_155027 Paper: pytrain.20260410154958.002	Python Reliability Drill: Advanced Typing & Generics This repository contains a coding benchmark designed to test advanced Python typing capabilities, specifically leveraging PEP 695 (Type Parameter Syntax) introduced in Python 3.12. Objective Implement a generic `Pipeline` class that enforce...	04-10 15:51	Success	-	View
exp_self.20260410153050.001_20260410_153116 Paper: self.20260410153050.001	Benchmark for SSM Strategy: Stress Test Overview This benchmark evaluates the SSM Strategy Stress Test, comparing a standard dense processing approach against an optimized SSM-inspired implementation featuring disciplined memory policies, caching, and dynamic precision (bf16)...	04-10 15:32	Success	-	View
exp_pytrain.20260410144330.001_20260410_144415 Paper: pytrain.20260410144330.001	Type-Safe Plugin Architecture Simulator Benchmark This benchmark validates the capability of an autonomous system to dynamically generate Python package structures, implement strict typing protocols using `typing.Protocol` and `typing.TypeVar`, and perform runtime module discovery and load...	04-10 14:45	Success	-	View
exp_pytrain.20260410140132.025_20260410_140159 Paper: pytrain.20260410140132.025	Dynamic Plugin Loader with Strict Type Validation Overview This coding drill tests the hypothesis that a robust Python system can dynamically construct local package structures at runtime, strictly define interface contracts using `typing.Protocol`, and utilize `importlib` to load and vali...	04-10 14:03	Success	-	View
exp_self.20260410134129.024_20260410_134158 Paper: self.20260410134129.024	SSM Strategy Stress Test Benchmark This repository contains a benchmark designed to test the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy (specifically chunked state management and hardware-aware cache utilization) improves inference th...	04-10 13:43	Success	-	View
exp_pytrain.20260410125519.024_20260410_125602 Paper: pytrain.20260410125519.024	Python Skill Fallback Title: Strictly Typed Module Architecture: Configuration Validator - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-10 12:57	Success	-	View
exp_self.20260410123602.023_20260410_123618 Paper: self.20260410123602.023	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the memory efficiency and throughput of a Selective State Space Model (SSM) strategy against a standard Transformer baseline. Innovation Abstract Hypothesis: Applying SSM with disciplined memory policy improves...	04-10 12:37	Success	-	View
exp_pytrain.20260410114924.023_20260410_114949 Paper: pytrain.20260410114924.023	Python Skill Fallback Title: Type-Safe Plugin Registry with Dynamic Imports - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-10 11:50	Success	-	View
exp_self.20260410112808.022_20260410_112829 Paper: self.20260410112808.022	SSM Strategy Stress Test Benchmark This benchmark evaluates whether applying State Space Models (SSM) with a disciplined memory policy improves throughput under 8GB VRAM constraints. Overview The benchmark compares two implementations: 1. Baseline SSM: Standard implement...	04-10 11:29	Success	-	View
exp_pytrain.20260410103325.022_20260410_103402 Paper: pytrain.20260410103325.022	Generic Datastore using PEP 695 Type Parameters Benchmark This benchmark evaluates a Python 3.12+ implementation of a type-safe Key-Value Store utilizing PEP 695 Type Parameter Syntax. Hypothesis Adopting Python 3.12's `class Class[T]:` and `type Alias = ...` syntax significantly reduces syntactic...	04-10 10:35	Success	-	View
exp_self.20260410101117.021_20260410_101139 Paper: self.20260410101117.021	SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that applying a State Space Model (SSM) with a disciplined memory policy improves throughput under 8GB VRAM constraints compared to a naive baseline implementation. Hypothesis Applying SSM with discip...	04-10 10:12	Success	-	View
exp_pytrain.20260410091951.021_20260410_092012 Paper: pytrain.20260410091951.021	Strict Typed Package Scaffolder Overview This benchmark evaluates an autonomous coding agent's ability to synthesize a utility that bridges abstract type definitions with concrete filesystem operations. The goal is to generate a standards-compliant Python project structur...	04-10 09:21	Success	-	View
exp_self.20260410085852.020_20260410_085922 Paper: self.20260410085852.020	SSM Strategy Stress Test Benchmark This repository contains a standalone benchmark designed to evaluate the efficiency of State Space Models (SSMs) against standard Transformer architectures under memory-constrained scenarios (8GB VRAM limit). Hypothesis Applying SSMs with a...	04-10 09:00	Success	-	View
exp_pytrain.20260410080757.020_20260410_080827 Paper: pytrain.20260410080757.020	Python Skill Fallback Title: Robust Plugin Loader with Strict Typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-10 08:09	Success	-	View
exp_hf_2604.08120_20260410_075547 Paper: hf_2604.08120	Benchmark: Adaptive Token Allocation (ATA) for Long Video Understanding This benchmark simulates the Tempo framework for efficient long-video understanding. It tests the core hypothesis: that a Small Vision-Language Model (SVLM) acting as a query-aware compressor can drastically reduce VRAM usage while main...	04-10 07:56	Success	-	View
exp_pytrain.20260410073453.019_20260410_073513 Paper: pytrain.20260410073453.019	Python Skill Fallback Title: Type-Safe Plugin Registry and Configuration Validator - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-10 07:36	Success	-	View
exp_self.20260410071430.019_20260410_071452 Paper: self.20260410071430.019	SSM Strategy Stress Test: Disciplined Memory Policy Benchmark Overview This benchmark evaluates the performance of a State Space Model (SSM) under constrained memory conditions (8GB VRAM target). It compares a Baseline (standard FP32) against an Optimized variant that applies a disciplined mem...	04-10 07:16	Success	-	View
exp_pytrain.20260410062427.018_20260410_062453 Paper: pytrain.20260410062427.018	Strictly-Typed Metadata Validator and Plugin Loader This benchmark demonstrates a robust, zero-dependency package management system implementation using Python's advanced static typing features. Hypothesis Leveraging Python's advanced static typing features (`Protocol`, `TypeGuard`, and `Gen...	04-10 06:25	Success	-	View
exp_self.20260410060220.018_20260410_060252 Paper: self.20260410060220.018	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260410060220.018 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-10 06:03	Success	-	View
exp_pytrain.20260410050440.017_20260410_050517 Paper: pytrain.20260410050440.017	Type-Safe Plugin Loader Simulation This benchmark demonstrates the capability of an autonomous coding system to leverage Python's `typing` and `inspect` modules to construct a runtime plugin loader that enforces strict interface compliance. Hypothesis: An autonomous syst...	04-10 05:06	Success	-	View
exp_self.20260410043647.017_20260410_043726 Paper: self.20260410043647.017	SSM Strategy Stress Test This benchmark compares a standard Transformer-based architecture against an SSM (State Space Model) variant optimized with a disciplined memory policy and dynamic precision. The objective is to validate the hypothesis that SSMs with strict...	04-10 04:38	Success	-	View
exp_pytrain.20260410032856.016_20260410_032928 Paper: pytrain.20260410032856.016	Benchmark: Robust Dynamic Plugin Loader with Protocol Validation Objective This benchmark validates a Python engineer's ability to construct a secure, dynamic plugin system. It demonstrates the bridge between Python's runtime import machinery (`importlib`) and its static type hinting system (`typing.Prot...	04-10 03:30	Success	-	View
exp_self.20260410030024.016_20260410_030116 Paper: self.20260410030024.016	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260410030024.016 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-10 03:02	Success	-	View
exp_pytrain.20260410015343.015_20260410_015415 Paper: pytrain.20260410015343.015	Python Reliability Drill: Type-Safe Container Benchmark This benchmark tests the implementation of a robust, generic `TypeSafeContainer` utility. The goal is to demonstrate proficiency with Python's type hinting system (PEP 484), runtime type enforcement, and error handling without relying on ex...	04-10 01:55	Success	-	View
exp_self.20260410012423.015_20260410_012459 Paper: self.20260410012423.015	Self-directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the Memory-Disciplined SSM innovation against a standard baseline. The hypothesis is that applying a disciplined memory policy (chunking and explicit cache management) to State Space Models (SSM) improv...	04-10 01:26	Success	-	View
exp_pytrain.20260410002315.014_20260410_002405 Paper: pytrain.20260410002315.014	Strict Package Type Auditor Overview This benchmark provides a self-contained Python script that implements a static analysis tool for auditing Python packages. The tool, `audit_pkg.py` (implemented as a core function within `benchmark.py`), inspects a given directory...	04-10 00:25	Success	-	View
exp_self.20260409235854.014_20260409_235923 Paper: self.20260409235854.014	Self-directed Benchmark: SSM Strategy Stress Test 1. Overview This benchmark evaluates the memory efficiency and throughput of State Space Model (SSM) strategies compared to traditional Transformer attention mechanisms under strict constraints (simulated 8GB VRAM limit). The innovation...	04-10 00:00	Success	-	View
exp_pytrain.20260409225445.013_20260409_225545 Paper: pytrain.20260409225445.013	Python Skill Fallback Title: Strictly Typed Plugin Registry with Semantic Versioning - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-09 22:56	Success	-	View
exp_self.20260409222703.013_20260409_222804 Paper: self.20260409222703.013	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260409222703.013 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-09 22:29	Success	-	View
exp_pytrain.20260409212103.012_20260409_212134 Paper: pytrain.20260409212103.012	Type-Safe Dynamic Plugin Registry Benchmark This benchmark tests a Python developer's ability to implement a robust, extensible architecture using Python's `typing` module for Protocols and `importlib` for dynamic runtime discovery. Problem Description Modern Python frameworks often...	04-09 21:22	Success	-	View
exp_self.20260409205521.012_20260409_205545 Paper: self.20260409205521.012	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260409205521.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-09 20:56	Success	-	View
exp_self.20260409200913.011_20260409_200932 Paper: self.20260409200913.011	Self-directed benchmark: SSM strategy stress test Overview This benchmark evaluates the impact of a disciplined memory policy (Dynamic Precision) on a State Space Model (SSM) architecture similar to Mamba. The goal is to validate if aggressive memory optimization improves throughput under...	04-09 20:10	Success	-	View
exp_pytrain.20260409195603.011_20260409_195646 Paper: pytrain.20260409195603.011	Benchmark: Typed CLI Log Filter This benchmark evaluates a Python coding system's ability to generate a structured, robust Python module that adheres to modern packaging and typing standards while functioning as both a library and a command-line interface. Objective The s...	04-09 19:57	Success	-	View
exp_self.20260409193726.010_20260409_193756 Paper: self.20260409193726.010	SSM Strategy Stress Test This benchmark evaluates the "SSM Strategy" hypothesis: that using State Space Models (SSMs) with a disciplined memory policy significantly improves throughput and reduces VRAM usage compared to standard attention-based baselines when opera...	04-09 19:39	Success	-	View
exp_pytrain.20260409185734.010_20260409_185755 Paper: pytrain.20260409185734.010	Robust Asynchronous Plugin Loader This benchmark evaluates the design of a strict, type-safe asynchronous plugin system using only the Python standard library. Objectives 1. Protocol Enforcement: Demonstrate the use of `typing.Protocol` to define structural subtyping (d...	04-09 18:59	Success	-	View
exp_self.20260409183848.009_20260409_183932 Paper: self.20260409183848.009	SSM Strategy Stress Test This benchmark evaluates the performance of a Selective State Space Model (SSM) architecture under constrained memory conditions. Objective To validate the hypothesis that a disciplined memory policy (utilizing `torch.compile` kernel fusion...	04-09 18:40	Success	-	View
exp_2604.07350v1_20260409_180410 Paper: 2604.07350v1	Fast Spatial Memory with Elastic Test-Time Training Paper ID: 2604.07350v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-09 18:05	Success	-	View
exp_pytrain.20260409174610.009_20260409_174629 Paper: pytrain.20260409174610.009	Python Skill Fallback Title: Dynamic Type-Verified Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-09 17:47	Success	-	View
exp_self.20260409172843.008_20260409_172909 Paper: self.20260409172843.008	README: Self-directed SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that a disciplined SSM (State Space Model) memory policy improves throughput under strict memory constraints (specifically targeting < 8GB VRAM usage) compared to a standard Transformer-style...	04-09 17:30	Success	-	View
exp_pytrain.20260409164600.008_20260409_164626 Paper: pytrain.20260409164600.008	Python Skill Fallback Title: Dynamic Plugin Registry with Type-Safe Discovery - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-09 16:47	Success	-	View
exp_self.20260409162725.007_20260409_162756 Paper: self.20260409162725.007	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260409162725.007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-09 16:28	Success	-	View
exp_pytrain.20260409154439.007_20260409_154458 Paper: pytrain.20260409154439.007	Python Skill Fallback Title: Generic Type-Safe Component Registry - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-09 15:46	Success	-	View
exp_self.20260409152700.006_20260409_152722 Paper: self.20260409152700.006	SSM Strategy Stress Test Objective: Evaluate the performance impact of a disciplined State Space Model (SSM) memory policy against a standard attention-based baseline under strict 8GB VRAM constraints. Hypothesis: Applying SSM with disciplined memory policy...	04-09 15:28	Success	-	View
exp_hf_2604.05643_20260409_145252 Paper: hf_2604.05643	Graph-Based Chain-of-Thought Pruning Benchmark This benchmark evaluates the efficiency gains of the proposed Graph-Based CoT Pruning framework. The innovation targets the reduction of "Indiscriminate" and "Repetitive" reflections in Large Language Models (LLMs) by converting linear...	04-09 14:53	Success	-	View
exp_pytrain.20260409143402.006_20260409_143428 Paper: pytrain.20260409143402.006	Dynamic Module Loader with Runtime Protocol Verification This benchmark tests the ability to dynamically compile, load, and validate Python modules from source code strings at runtime. It simulates a plugin architecture where untrusted code must be strictly verified against a `typing.Protocol` be...	04-09 14:35	Success	-	View
exp_self.20260409141422.005_20260409_141502 Paper: self.20260409141422.005	Benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying a State Space Model (SSM) strategy with a disciplined memory policy (specifically, a Mamba-inspired selective scan) significantly improves throughput and reduces VRAM footprint compa...	04-09 14:16	Success	-	View
exp_pytrain.20260409133428.005_20260409_133446 Paper: pytrain.20260409133428.005	Dynamic Type-Verified Package Loader This benchmark demonstrates the creation of a robust, autonomous plugin loading system using Python's standard library. Objective The goal is to simulate a dynamic extension system where: 1. A temporary Python package is generated programma...	04-09 13:35	Success	-	View
exp_self.20260409131636.004_20260409_131657 Paper: self.20260409131636.004	Self-Directed Benchmark: SSM Strategy Stress Test This benchmark evaluates the impact of a disciplined memory policy (Dynamic Precision + Cache Management) on State Space Models (SSM) under tight VRAM constraints (targeting < 8GB). Hypothesis Applying an SSM with a disciplined memory polic...	04-09 13:18	Success	-	View
exp_pytrain.20260409123802.004_20260409_123819 Paper: pytrain.20260409123802.004	Strictly Typed Generic Data Processor This benchmark evaluates the implementation of a robust, reusable data processing component using Python's advanced static typing features. The focus is on creating a strictly typed library using `typing.Generic`, `typing.TypeVar`, and `typ...	04-09 12:39	Success	-	View
exp_self.20260409122132.003_20260409_122158 Paper: self.20260409122132.003	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260409122132.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-09 12:23	Success	-	View
exp_pytrain.20260409114310.003_20260409_114329 Paper: pytrain.20260409114310.003	Dynamic Plugin Loader with Protocol Enforcement This benchmark tests the ability to construct a modular, type-safe system using Python's standard library. It programmatically generates a Python plugin script on disk, utilizes `importlib` to load it into the runtime, validates the loaded...	04-09 11:44	Success	-	View
exp_self.20260409112139.002_20260409_112202 Paper: self.20260409112139.002	SSM Strategy Stress Test This benchmark evaluates the hypothesis that a State Space Model (SSM) architecture, specifically one mimicking the memory efficiency of `mamba`, achieves higher throughput than standard Transformer-style baselines when constrained to 8...	04-09 11:24	Success	-	View
exp_pytrain.20260409102502.002_20260409_102530 Paper: pytrain.20260409102502.002	Python Skill Fallback Title: Generic Repository with PEP 695 Syntax and Strict Encapsulation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-09 10:26	Success	-	View
exp_self.20260409100210.001_20260409_100233 Paper: self.20260409100210.001	SSM Strategy Stress Test This benchmark evaluates the efficacy of a State Space Model (SSM) strategy against a standard Transformer baseline under strict memory constraints (8GB VRAM limit). Hypothesis Applying an SSM with a disciplined memory policy (state retenti...	04-09 10:03	Success	-	View
exp_pytrain.20260409090302.001_20260409_090334 Paper: pytrain.20260409090302.001	Python Skill Fallback Title: Type-Safe Dependency Introspection System - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-09 09:04	Success	-	View
exp_pytrain.20260409075940.114_20260409_075957 Paper: pytrain.20260409075940.114	Dynamic Plugin Loader with Strict Protocol Validation This benchmark tests the ability to implement a robust runtime module loader that simulates package dynamics by writing and importing modules programmatically, while enforcing strict type adherence using Python's `typing.Protocol` and `runt...	04-09 08:01	Success	-	View
exp_self.20260409073439.086_20260409_073513 Paper: self.20260409073439.086	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that a disciplined memory policy within a State Space Model (SSM) implementation improves throughput under 8GB VRAM constraints. The script compares a Baseline SSM (naive state accumulation) again...	04-09 07:36	Success	-	View
exp_pytrain.20260409063320.113_20260409_063357 Paper: pytrain.20260409063320.113	Runtime Package Constructor and Protocol Verifier Overview This benchmark evaluates an engineer's ability to dynamically construct Python packaging structures in-memory and enforce strict runtime type safety. The candidate must implement a `DynamicPackageLoader` class that simulates the lo...	04-09 06:35	Success	-	View
exp_hf_2604.04913_20260409_061848 Paper: hf_2604.04913	A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens Paper ID: hf_2604.04913 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-09 06:19	Success	-	View
exp_pytrain.20260409054728.112_20260409_054802 Paper: pytrain.20260409054728.112	Typed Plugin Registry & Configuration Loader Overview This benchmark evaluates the implementation of a robust, type-safe plugin registry system using only the Python standard library. It simulates the architecture patterns often seen in large-scale frameworks (like HuggingFace Transfo...	04-09 05:49	Success	-	View
exp_pytrain.20260409051458.111_20260409_051546 Paper: pytrain.20260409051458.111	Generic CLI Data Transformer with Strict Typing This coding drill focuses on constructing a robust Command Line Interface (CLI) tool for data transformation using Python's standard library. The objective is to implement a generic Extract, Transform, Load (ETL) pipeline utility that conve...	04-09 05:16	Success	-	View
exp_pytrain.20260409041903.110_20260409_042209 Paper: pytrain.20260409041903.110	Python Reliability Drill: Strict Typing & Performance Objective This benchmark evaluates your ability to write robust, type-safe Python code using standard library features only. It emphasizes strict type annotations (`typing` module), internal package structure, runtime validation, and perfor...	04-09 04:23	Success	-	View
exp_self.20260409024723.085_20260409_024941 Paper: self.20260409024723.085	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260409024723.085 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-09 02:50	Success	-	View
exp_pytrain.20260409005219.109_20260409_005239 Paper: pytrain.20260409005219.109	Protocol-Based Plugin Pipeline This benchmark demonstrates the use of Python's `typing.Protocol` with `@runtime_checkable` to create a flexible, type-safe plugin architecture. This architectural pattern enables structural subtyping (duck typing with static verification)...	04-09 00:53	Success	-	View
exp_hf_2604.06912_20260409_003812 Paper: hf_2604.06912	Q-Zoom: Query-Aware Adaptive Perception for Efficient Multimodal Large Language Models Paper ID: hf_2604.06912 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-09 00:39	Success	-	View
exp_pytrain.20260409001404.108_20260409_001428 Paper: pytrain.20260409001404.108	Robust Plugin Loader with Structural Typing Benchmark This benchmark evaluates the implementation of a robust plugin architecture using Python's standard library. It focuses on two advanced Python features: `typing.Protocol` for Structural Subtyping (Duck Typing) and `importlib` for dynamic mo...	04-09 00:15	Success	-	View
exp_self.20260408234520.084_20260408_234548 Paper: self.20260408234520.084	Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that a State Space Model (SSM) strategy with a disciplined memory policy (inspired by Mamba architectures) significantly improves throughput and stabilizes VRAM usage under high-context const...	04-08 23:50	Success	-	View
exp_pytrain.20260408224742.107_20260408_224807 Paper: pytrain.20260408224742.107	Generic Data Normalizer Registry This project implements a robust, plugin-based architecture for data normalization using Python's `typing.Protocol` for structural subtyping. It demonstrates how to define generic interfaces and manage concrete implementations (plugins) wit...	04-08 22:49	Success	-	View
exp_pytrain.20260408221112.106_20260408_221221 Paper: pytrain.20260408221112.106	Python Skill Fallback Title: Type-Safe Component Registry with Dynamic Configuration - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-08 22:13	Success	-	View
exp_hf_2604.07023_20260408_214925 Paper: hf_2604.07023	MARS: Enabling Autoregressive Models Multi-Token Generation Paper ID: hf_2604.07023 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-08 21:50	Success	-	View
exp_pytrain.20260408211723.105_20260408_211800 Paper: pytrain.20260408211723.105	Strictly Typed Plugin Registry with Runtime Protocol Enforcement Overview This benchmark tests the ability to design a strictly typed, modular plugin system using Python's standard library. The system utilizes `typing.Protocol` for interface definition and `runtime_checkable` for strict validation during...	04-08 21:19	Success	-	View
exp_pytrain.20260408204129.104_20260408_204210 Paper: pytrain.20260408204129.104	PEP 561 Compliant Package Scaffolder Overview This coding drill benchmark tests the ability to write a sophisticated CLI tool that generates a standards-compliant Python project structure. The tool must strictly adhere to PEP 517 (build system), PEP 621 (project metadata), and...	04-08 20:43	Success	-	View
exp_self.20260408200742.083_20260408_200808 Paper: self.20260408200742.083	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the performance of State Space Models (SSMs) against traditional Transformer-style attention mechanisms under strict memory constraints. Hypothesis Applying an SSM with a disciplined memory policy improves throughpu...	04-08 20:09	Success	-	View
exp_pytrain.20260408190658.103_20260408_190726 Paper: pytrain.20260408190658.103	Standard Library Wheel Archiver Challenge: Implement a minimal PEP 427 Wheel packager using only the Python Standard Library. Objective: Create a self-contained Python script (`benchmark.py`) that takes a project directory, compiles source code (optional but good...	04-08 19:08	Success	-	View
exp_self.20260408184419.082_20260408_184448 Paper: self.20260408184419.082	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260408184419.082 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-08 18:45	Success	-	View
exp_pytrain.20260408175501.102_20260408_175525 Paper: pytrain.20260408175501.102	PEP 695 Generic Data Processor & Module API Design This benchmark validates the implementation of Python 3.12+ features, specifically PEP 695 (Type Parameter Syntax), within a robust data processing context. Problem Statement Legacy Python typing relies on verbose `Generic` inheritance and...	04-08 17:56	Success	-	View
exp_self.20260408173237.081_20260408_173319 Paper: self.20260408173237.081	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260408173237.081 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-08 17:34	Success	-	View
exp_pytrain.20260408162803.101_20260408_162826 Paper: pytrain.20260408162803.101	Dynamic Package Constructor and Type Introspector Hypothesis Combining `typing.TypedDict` for schema validation with `importlib` for dynamic module loading enables the creation of robust, self-validating package scaffolding utilities that strictly enforce typing standards at runtime. Goal...	04-08 16:29	Success	-	View
exp_self.20260408160543.080_20260408_160610 Paper: self.20260408160543.080	Self-directed SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy (recurrent state management) significantly reduces VRAM usage and improves throughput compared to a naive "unrolled" implementa...	04-08 16:07	Success	-	View
exp_pytrain.20260408151421.100_20260408_151445 Paper: pytrain.20260408151421.100	Strictly Typed 1D Tensor Module Overview This coding drill implements a robust, strictly typed 1-dimensional Tensor (Vector) library using pure Python standard library features. The core objective is to demonstrate advanced Python typing mechanisms, specifically **Generic...	04-08 15:15	Success	-	View
exp_self.20260408145359.079_20260408_145413 Paper: self.20260408145359.079	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260408145359.079 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-08 14:55	Success	-	View
exp_pytrain.20260408140115.099_20260408_140149 Paper: pytrain.20260408140115.099	Generic Model Registry with Type-Safety This drill demonstrates the creation of a robust, type-safe component registry using Python's `typing` module. Learning Objectives * Protocol Definition: Define strict interfaces using `typing.Protocol` that enforce structural subtyping...	04-08 14:02	Success	-	View
exp_self.20260408133955.078_20260408_134024 Paper: self.20260408133955.078	SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with disciplined memory policies improves throughput under constrained VRAM (8GB target). Overview The test compares a standard Transformer-style atten...	04-08 13:41	Success	-	View
exp_pytrain.20260408125053.098_20260408_125116 Paper: pytrain.20260408125053.098	Benchmark: Protocol-Based Dynamic Plugin Loader Design Brief: The objective of this coding drill is to engineer a robust, runtime-safe plugin loading system. The solution must generate a temporary package structure containing varied plugin definitions (valid, invalid, and broken) and...	04-08 12:52	Success	-	View
exp_self.20260408122028.077_20260408_122139 Paper: self.20260408122028.077	SSM Memory Policy Stress Test This benchmark evaluates the hypothesis that applying a State Space Model (SSM) strategy with a disciplined memory policy (specifically utilizing dynamic precision and efficient state caching) improves throughput under constrained VRAM...	04-08 12:24	Success	-	View
exp_pytrain.20260408111422.097_20260408_111509 Paper: pytrain.20260408111422.097	Strictly Typed Dynamic Module Loader Overview This benchmark demonstrates a robust Python application architecture that dynamically loads standard library modules at runtime. It enforces type safety constraints using `typing.Protocol` and `@runtime_checkable`, ensuring that dy...	04-08 11:16	Success	-	View
exp_self.20260408104359.076_20260408_104451 Paper: self.20260408104359.076	SSM Strategy Stress Test Benchmark This benchmark evaluates the effectiveness of memory optimization strategies in State Space Models (SSMs) under constrained memory conditions (8GB VRAM). Overview The benchmark compares two SSM implementations: 1. Baseline: A standard S...	04-08 10:48	Success	-	View
exp_pytrain.20260408094121.096_20260408_094146 Paper: pytrain.20260408094121.096	README: Strictly Typed Dynamic Plugin Loader Benchmark Objective This benchmark validates the hypothesis that an autonomous system can dynamically discover Python modules at runtime and strictly enforce interface compliance using Structural Sub-typing (Protocols) rather than explicit inheritanc...	04-08 09:42	Success	-	View
exp_self.20260408091606.075_20260408_091627 Paper: self.20260408091606.075	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260408091606.075 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-08 09:17	Success	-	View
exp_pytrain.20260408081853.095_20260408_081927 Paper: pytrain.20260408081853.095	Python Skill Fallback Title: Generic Type-Safe Event Bus with Strict API - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-08 08:20	Success	-	View
exp_self.20260408075326.074_20260408_075358 Paper: self.20260408075326.074	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that a State Space Model (SSM) strategy provides superior throughput compared to traditional Attention mechanisms under strict memory constraints (simulated 8GB VRAM limit). Instructions 1. **Dependen...	04-08 07:55	Success	-	View
exp_pytrain.20260408065610.094_20260408_065638 Paper: pytrain.20260408065610.094	Dynamic Type-Verified Package Scaffolder Overview This benchmark evaluates the ability of a coding agent or engineer to programmatically construct a valid Python package structure on the filesystem, populate it with modules containing strict Type Hints, and dynamically load and ve...	04-08 06:57	Success	-	View
exp_self.20260408062903.073_20260408_062925 Paper: self.20260408062903.073	SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies (specifically the constant-memory recurrence found in architectures like Mamba) improves throughput and reduces VRAM pressure compared to standard Atte...	04-08 06:30	Success	-	View
exp_pytrain.20260408052934.093_20260408_053032 Paper: pytrain.20260408052934.093	Type-Safe Plugin Architecture with Runtime Discovery This benchmark demonstrates the implementation of a robust, extensible plugin system using Python's `typing.Protocol` and `inspect` module. It simulates a library core (like vLLM or PyTorch) that dynamically discovers and validates model im...	04-08 05:31	Success	-	View
exp_pytrain.20260408045130.092_20260408_045341 Paper: pytrain.20260408045130.092	Generic Plugin Registry Benchmark This benchmark evaluates the implementation of a type-safe, extensible Plugin Registry system using Python's advanced static typing features. Objective Create a `benchmark.py` script that simulates a robust package structure (using `__all__...	04-08 04:54	Success	-	View
exp_pytrain.20260408031557.091_20260408_031754 Paper: pytrain.20260408031557.091	Strictly Typed Modular Data ETL Framework This benchmark tests your ability to architect a robust, single-file Python script that simulates a package structure using advanced typing features (`typing.Protocol`, `typing.TypeVar`, `typing.Generic`) and standard library introspection...	04-08 03:18	Success	-	View
exp_pytrain.20260408021119.090_20260408_021207 Paper: pytrain.20260408021119.090	Strictly Typed Async Event Dispatcher Benchmark This benchmark tests the implementation of a generic, strictly-typed asynchronous event dispatcher using Python's standard `asyncio` and `typing` libraries. Goal Create a single-file Python module (`benchmark.py`) that acts as a standalone...	04-08 02:13	Success	-	View
exp_pytrain.20260408011910.089_20260408_012110 Paper: pytrain.20260408011910.089	Benchmark: Runtime Plugin System with Protocol Validation Design Brief This benchmark tests an autonomous system's ability to integrate Python's dynamic module loading capabilities (`importlib`) with static type enforcement (`typing.Protocol`). The system must construct a robust, extensible archit...	04-08 01:22	Success	-	View
exp_pytrain.20260408003923.088_20260408_003952 Paper: pytrain.20260408003923.088	Python Skill Fallback Title: Generic Plugin Loader & Dynamic Package Validator - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-08 00:40	Success	-	View
exp_self.20260408001429.072_20260408_001458 Paper: self.20260408001429.072	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260408001429.072 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-08 00:16	Success	-	View
exp_pytrain.20260407231114.087_20260407_231204 Paper: pytrain.20260407231114.087	Python Skill Fallback Title: Strictly-Typed Dynamic Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-07 23:13	Success	-	View
exp_self.20260407223256.071_20260407_223345 Paper: self.20260407223256.071	SSM Strategy Stress Test: Memory Policy Benchmark Overview This benchmark evaluates the hypothesis that applying a disciplined memory policy (specifically gradient checkpointing and state-space tiling) to State Space Models (SSMs) improves throughput under strict hardware constraints (...	04-07 22:44	Success	-	View
exp_pytrain.20260407210023.086_20260407_210057 Paper: pytrain.20260407210023.086	Dynamic Plugin Loader with Protocol Validation Overview This benchmark tests the ability to construct a robust, type-safe dynamic import mechanism using Python's standard library. The script programmatically generates a package structure on disk, enforces interface compliance via `typin...	04-07 21:02	Success	-	View
exp_self.20260407203054.070_20260407_203122 Paper: self.20260407203054.070	Self-directed benchmark: ssm strategy stress test Overview This benchmark evaluates the Hypothesis: applying ssm with disciplined memory policy improves throughput under 8GB constraints. It compares two distinct modes of processing a long sequence: 1. Baseline (Naive SSM): Processe...	04-07 20:32	Success	-	View
exp_pytrain.20260407193507.085_20260407_193537 Paper: pytrain.20260407193507.085	Typed Configuration Micro-Package Overview This benchmark evaluates the ability of an autonomous coding system to design a robust, reusable library module within a single Python file. The task requires combining strong static typing (using Protocols and Generics) with packa...	04-07 19:36	Success	-	View
exp_self.20260407191353.069_20260407_191438 Paper: self.20260407191353.069	SSM Strategy Stress Test: Disciplined Memory Policy This benchmark evaluates the impact of a disciplined memory policy on State Space Model (SSM) throughput under constrained VRAM conditions (8GB target). Hypothesis Applying an SSM with a disciplined memory policy (chunked state inference) i...	04-07 19:15	Success	-	View
exp_pytrain.20260407182706.084_20260407_182728 Paper: pytrain.20260407182706.084	Robust Typed CLI Factory An autonomous system can engineer a reusable command-line interface factory that dynamically maps input arguments to a typed configuration class using Python's standard introspection libraries, ensuring strict type safety without external d...	04-07 18:28	Success	-	View
exp_self.20260407180749.068_20260407_180832 Paper: self.20260407180749.068	SSM Strategy Stress Test Overview This benchmark evaluates the "Mamba-style" SSM (State Space Model) strategy against a standard Transformer baseline under strict memory constraints. The goal is to validate the hypothesis that applying an SSM with a disciplined mem...	04-07 18:09	Success	-	View
exp_pytrain.20260407172351.083_20260407_172422 Paper: pytrain.20260407172351.083	Python Skill Fallback Title: Robust Typed Configuration Manager - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-07 17:25	Success	-	View
exp_self.20260407170431.067_20260407_170452 Paper: self.20260407170431.067	SSM Strategy Stress Test Benchmark This benchmark evaluates the memory efficiency and throughput of a State Space Model (SSM) inference strategy when subjected to a disciplined chunking memory policy versus a naive full-sequence baseline. Objective The goal is to simulat...	04-07 17:05	Success	-	View
exp_pytrain.20260407161949.082_20260407_162015 Paper: pytrain.20260407161949.082	PEP 695 Generic Event Dispatcher Benchmark Overview This coding drill evaluates the implementation and performance of an Event Dispatcher system utilizing PEP 695 Type Parameter Syntax (introduced in Python 3.12). Objective Implement a type-safe, generic event dispatcher wit...	04-07 16:21	Success	-	View
exp_self.20260407155808.066_20260407_155832 Paper: self.20260407155808.066	This benchmark tests a synthetic SSM (State Space Model) against a standard Attention baseline to validate the hypothesi... Benchmark: SSM Strategy Stress Test Overview This script evaluates the memory efficiency and processing speed of a State Space Model (SSM) strategy compared to a standard Transformer Attention baseline. It simulates a "disciplined memory po...	04-07 16:00	Success	-	View
exp_pytrain.20260407151008.081_20260407_151031 Paper: pytrain.20260407151008.081	Python Skill Fallback Title: Strictly Typed Command Dispatcher with Package Metadata - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-07 15:11	Success	-	View
exp_self.20260407144925.065_20260407_144954 Paper: self.20260407144925.065	This repository contains a micro-benchmark designed to evaluate the efficiency gains of State Space Models (SSMs) with d... Objective The benchmark tests the hypothesis that applying SSM strategies (specifically mimicking the selective scan mechanisms of Mamba architectures) significantly improves throughput and reduces VRAM pressure when processing long sequenc...	04-07 14:51	Success	-	View
exp_pytrain.20260407135941.080_20260407_140034 Paper: pytrain.20260407135941.080	Robust Type-Safe Quantization Kernel Benchmark This project demonstrates a simulation of a quantized linear layer often found in Large Language Models (LLMs), utilizing only the Python standard library. It focuses on strict static typing, package metadata structures, and type-safe opera...	04-07 14:01	Success	-	View
exp_self.20260407133703.064_20260407_133726 Paper: self.20260407133703.064	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260407133703.064 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-07 13:38	Success	-	View
exp_pytrain.20260407124905.079_20260407_124931 Paper: pytrain.20260407124905.079	Benchmark: Typed Model Registry & Public API Management This benchmark evaluates the implementation of a type-safe, modular component registry system using Python's standard library `typing` module. The goal is to demonstrate robust API design patterns often found in large-scale ML frameworks (l...	04-07 12:50	Success	-	View
exp_self.20260407122809.063_20260407_122840 Paper: self.20260407122809.063	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260407122809.063 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-07 12:29	Success	-	View
exp_pytrain.20260407113455.078_20260407_113538 Paper: pytrain.20260407113455.078	Type-Safe Plugin Loader Benchmark This project demonstrates a robust, type-safe plugin architecture using Python's standard library. It leverages `typing.Protocol` for structural subtyping (interface compliance without inheritance) and `typing.Generic` for a flexible, type-...	04-07 11:36	Success	-	View
exp_self.20260407111110.062_20260407_111135 Paper: self.20260407111110.062	This benchmark is designed to test the hypothesis that State Space Models (SSMs) with a strict memory discipline (linear... README.md SSM Strategy Stress Test Benchmark Overview This benchmark evaluates the memory efficiency and throughput of a linear-complexity State Space Model (SSM) strategy against a quadratic-complexity Baseline Transformer attention mechan...	04-07 11:12	Success	-	View
exp_pytrain.20260407101030.077_20260407_101104 Paper: pytrain.20260407101030.077	Dynamic Module Loader and Protocol Verifier This coding drill validates a robust plugin architecture using Python's `typing.Protocol` for structural subtyping and `importlib` for runtime module discovery within an isolated file system environment. Scenario You are building an extensi...	04-07 10:12	Success	-	View
exp_self.20260407094036.061_20260407_094059 Paper: self.20260407094036.061	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the efficiency of a State Space Model (SSM) inference strategy against a standard Transformer attention baseline. The specific goal is to validate the hypothesis that a disciplined memory policy (inherent to the rec...	04-07 09:42	Success	-	View
exp_pytrain.20260407084328.076_20260407_084347 Paper: pytrain.20260407084328.076	Protocol-Based Dynamic Module Loader This benchmark evaluates the capability of an autonomous coding system to design a robust plugin architecture using Python's standard library. Objective To implement a dynamic module loading system that enforces strict interface compliance...	04-07 08:44	Success	-	View
exp_cr_10.3390_electronics15071535_20260407_083052 Paper: cr_10.3390_electronics15071535	Tac-Mamba: A Pose-Guided Cross-Modal State Space Model with Trust-Aware Gating for mmWave Radar Human Activity Recogniti... Paper ID: cr_10.3390_electronics15071535 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Rec...	04-07 08:31	Success	-	View
exp_pytrain.20260407080505.075_20260407_080526 Paper: pytrain.20260407080505.075	Generic Plugin Registry with PEP 695 Syntax Overview This benchmark evaluates a `PluginRegistry` system implementation leveraging Python 3.12's PEP 695 Type Parameter Syntax. It demonstrates the new generic class (`class MyClass[T]:`) and generic function (`def method :`) syntax...	04-07 08:06	Success	-	View
exp_self.20260407074112.060_20260407_074132 Paper: self.20260407074112.060	SSM Strategy Stress Test This benchmark evaluates the hypothesis that a State Space Model (SSM) strategy with a disciplined memory policy maintains higher throughput and lower VRAM usage compared to standard Transformer-based attention mechanisms under constrained...	04-07 07:42	Success	-	View
exp_pytrain.20260407065013.074_20260407_065037 Paper: pytrain.20260407065013.074	Strictly-Typed Model Configuration Registry This benchmark validates the design of a robust, type-safe configuration system for Large Language Models (LLMs) using Python's standard `typing` module. It enforces strict structural subtyping (Protocols) and semantic type aliases to preve...	04-07 06:51	Success	-	View
exp_self.20260407062325.059_20260407_062359 Paper: self.20260407062325.059	Self-directed benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy significantly improves inference throughput (tokens/sec) and reduces VRAM usage compared to standard Transformer archi...	04-07 06:25	Success	-	View
exp_pytrain.20260407051001.073_20260407_051116 Paper: pytrain.20260407051001.073	Type-Safe Entry Point Registry Overview This benchmark evaluates a custom `PluginRegistry` implementation designed to mimic the robustness of frameworks like vLLM or PyTorch. It leverages Python's `typing.Protocol` and `runtime_checkable` decorators to create a type-safe...	04-07 05:12	Success	-	View
exp_hf_2604.02073_20260407_045001 Paper: hf_2604.02073	PLUME: Latent Reasoning Based Universal Multimodal Embedding Paper ID: hf_2604.02073 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-07 04:51	Success	-	View
exp_pytrain.20260407041942.072_20260407_042011 Paper: pytrain.20260407041942.072	Typed Configuration and Plugin Registry System This benchmark implements a robust, mini-framework for a typed plugin registry system using the Python standard library. It demonstrates the architectural patterns found in large-scale libraries like Hugging Face Transformers and Diffusers....	04-07 04:21	Success	-	View
exp_pytrain.20260407034536.071_20260407_034646 Paper: pytrain.20260407034536.071	Python Skill Fallback Title: Type-Safe CLI Application Builder - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-07 03:47	Success	-	View
exp_pytrain.20260407030816.070_20260407_030904 Paper: pytrain.20260407030816.070	Concurrent Dependency Graph Resolver Benchmark This benchmark tests the ability to design a robust, typed, asynchronous dependency resolution system. The candidate must implement a `resolve_dependencies` function that utilizes `asyncio` for concurrency and strictly adheres to `typing` p...	04-07 03:10	Success	-	View
exp_pytrain.20260407023110.069_20260407_023152 Paper: pytrain.20260407023110.069	Structural Subtyping Plugin Loader Benchmark This benchmark tests the ability to define strict structural interfaces using Python's `typing.Protocol` and implement a robust discovery mechanism for dynamically generated code modules. The candidate system must identify valid implementat...	04-07 02:32	Success	-	View
exp_pytrain.20260407015524.068_20260407_015705 Paper: pytrain.20260407015524.068	Python Skill Fallback Title: Generic Plugin Registry with CLI Entry Points - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-07 01:58	Success	-	View
exp_pytrain.20260407012135.067_20260407_012226 Paper: pytrain.20260407012135.067	Generic Namespace Manager with Protocol Enforcement Overview This coding drill focuses on advanced Python type hinting and structural subtyping. You are tasked with implementing a `PackageManager` that acts as a namespace registry. It must leverage `typing.Generic`, `typing.TypeVar`, and `ty...	04-07 01:23	Success	-	View
exp_pytrain.20260407004901.066_20260407_004927 Paper: pytrain.20260407004901.066	Python Skill Fallback Title: In-Memory Plugin Architecture with Runtime Type Verification - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-07 00:50	Success	-	View
exp_self.20260407001121.058_20260407_001153 Paper: self.20260407001121.058	Self-directed benchmark: ssm strategy stress test This benchmark evaluates the memory efficiency and throughput of two distinct processing strategies under strict 8GB VRAM constraints: 1. Ablated Variant (Baseline): Simulates a "Global Attention" or "Full Cache" strategy. This model na...	04-07 00:18	Success	-	View
exp_pytrain.20260406231332.065_20260406_231357 Paper: pytrain.20260406231332.065	Dynamic Protocol-Compliant Plugin Loader This coding drill validates the ability to dynamically construct Python packages on a filesystem, load them using low-level `importlib` introspection tools, and enforce structural subtyping using `typing.Protocol`. Objective The candidate m...	04-06 23:15	Success	-	View
exp_hf_2604.04921_20260406_225822 Paper: hf_2604.04921	TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper ID: hf_2604.04921 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-06 22:59	Success	-	View
exp_pytrain.20260406222533.064_20260406_222600 Paper: pytrain.20260406222533.064	Python Skill Fallback Title: Strictly-Typed Package Configuration Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-06 22:27	Success	-	View
exp_self.20260406215309.057_20260406_215352 Paper: self.20260406215309.057	Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the "Disciplined Memory Policy" hypothesis for State Space Models (SSMs). It compares a standard full-precision SSM implementation against an optimized variant utilizing dynamic precision and memory checkpo...	04-06 21:55	Success	-	View
exp_pytrain.20260406204123.063_20260406_204146 Paper: pytrain.20260406204123.063	Generic Async Task Dispatcher with Protocol Enforcement This benchmark implements an asynchronous task processing system using Python's `typing.Protocol`, `typing.Generic`, and `asyncio`. It demonstrates a modular architecture where strict type contracts are enforced to ensure data safety and ro...	04-06 20:42	Success	-	View
exp_self.20260406201612.056_20260406_201653 Paper: self.20260406201612.056	SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that a Disciplined Memory Policy (selective state retention) in State Space Models (SSMs) significantly reduces VRAM usage while maintaining competitive throughput under strict 8GB constraints. Ov...	04-06 20:18	Success	-	View
exp_pytrain.20260406192440.062_20260406_192524 Paper: pytrain.20260406192440.062	Dynamic Generic Plugin Loader with PEP 695 Benchmark Overview This coding drill evaluates your ability to programmatically construct Python packages and utilize modern Python type systems (PEP 695). The script creates a temporary package structure on disk, injects source code using Python 3.1...	04-06 19:26	Success	-	View
exp_pytrain.20260406185011.061_20260406_185238 Paper: pytrain.20260406185011.061	Generic Plugin Loader with Runtime Type Validation This benchmark demonstrates a robust architectural pattern for building extensible Python applications. It utilizes `typing.Protocol` to define structural interfaces (contracts) that plugins must satisfy, and `importlib` to dynamically disc...	04-06 18:53	Success	-	View
exp_self.20260406182829.055_20260406_182851 Paper: self.20260406182829.055	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260406182829.055 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-06 18:29	Success	-	View
exp_pytrain.20260406173300.060_20260406_173333 Paper: pytrain.20260406173300.060	Python Skill Fallback Title: Strictly-Typed Backend Dispatcher with Dynamic Discovery - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-06 17:34	Success	-	View
exp_hf_2604.01609_20260406_172139 Paper: hf_2604.01609	Swift-SVD: Low-Rank LLM Compression Benchmark This benchmark evaluates the performance characteristics of Swift-SVD, a novel activation-aware compression framework. Specifically, it measures the VRAM reduction, Inference Throughput (Tokens/sec), and Compression Speed wh...	04-06 17:22	Success	-	View
exp_pytrain.20260406165827.059_20260406_165856 Paper: pytrain.20260406165827.059	Python Skill Fallback Title: Generic Component Registry with Simulated Sub-Module Registration - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-06 16:59	Success	-	View
exp_self.20260406163742.054_20260406_163802 Paper: self.20260406163742.054	Self-directed Benchmark: SSM Strategy Stress Test Hypothesis Applying SSM (State Space Model) architectures with a disciplined memory policy (specifically dynamic precision and compilation) improves throughput under 8GB VRAM constraints compared to a standard baseline configuration. Plan W...	04-06 16:39	Success	-	View
exp_pytrain.20260406154951.058_20260406_155021 Paper: pytrain.20260406154951.058	Generic Plugin Registry and CLI Dispatcher Challenge Overview This benchmark tests the ability to architect a robust, type-safe plugin system using Python's advanced `typing` features. The candidate must implement a generic command registry and a dispatcher that can handle different...	04-06 15:51	Success	-	View
exp_self.20260406151636.053_20260406_151707 Paper: self.20260406151636.053	SSM Strategy Stress Test This benchmark evaluates the memory efficiency and throughput performance of a State Space Model (SSM) strategy against a standard Dense baseline. It simulates a scenario with a large sequence length to stress GPU memory constraints (8GB li...	04-06 15:28	Success	-	View
exp_pytrain.20260406142735.057_20260406_142826 Paper: pytrain.20260406142735.057	Python Skill Fallback Title: Strictly-Typed Generic Data Pipeline with CLI Entry Point - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-06 14:29	Success	-	View
exp_self.20260406140413.052_20260406_140510 Paper: self.20260406140413.052	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260406140413.052 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-06 14:06	Success	-	View
exp_pytrain.20260406130831.056_20260406_130903 Paper: pytrain.20260406130831.056	Python Skill Fallback Title: Type-Safe Plugin Architecture with Dynamic Discovery - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-06 13:10	Success	-	View
exp_self.20260406124920.051_20260406_124944 Paper: self.20260406124920.051	SSM Strategy Stress Test This benchmark evaluates the performance implications of a disciplined memory policy applied to State Space Models (SSMs). It compares a standard sequential implementation against an optimized variant that utilizes chunked processing and au...	04-06 12:50	Success	-	View
exp_pytrain.20260406115518.055_20260406_115551 Paper: pytrain.20260406115518.055	Programmatic Package Construction and Runtime Type Verification Overview This coding drill tests the ability to dynamically construct a valid Python package distribution (simulating a wheel/ZIP), inject it into the runtime, and perform runtime type verification using the `typing` module. Objective Creat...	04-06 11:56	Success	-	View
exp_hf_2604.03118_20260406_113833 Paper: hf_2604.03118	Benchmark for Salt: Self-Consistent Distribution Matching This benchmark evaluates the computational efficiency and memory footprint characteristics of the Salt algorithm proposals. Specifically, it simulates the overhead introduced by: 1. SC-DMD (Self-Consistent Distribution Matching): Th...	04-06 11:39	Success	-	View
exp_pytrain.20260406111214.054_20260406_111234 Paper: pytrain.20260406111214.054	Typed Metadata Discovery System Objective Design and implement a robust `DistributionScanner` class that utilizes Python's standard library `importlib.metadata` to perform introspection on installed packages. Requirements 1. Strict Typing: Utilize `typing.TypedDict` t...	04-06 11:13	Success	-	View
exp_self.20260406104834.050_20260406_104902 Paper: self.20260406104834.050	Benchmark: SSM Strategy Stress Test This benchmark evaluates the performance of State Space Models (SSM) under constrained VRAM environments (8GB limit). It compares a baseline SSM implementation against a variant employing dynamic precision and disciplined memory policies. I...	04-06 10:50	Success	-	View
exp_pytrain.20260406095148.053_20260406_095225 Paper: pytrain.20260406095148.053	Python Skill Fallback Title: Strictly-Typed Multi-Backend Dispatcher Simulation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-06 09:53	Success	-	View
exp_self.20260406092333.049_20260406_092535 Paper: self.20260406092333.049	SSM Strategy Stress Test Benchmark This benchmark evaluates the performance of State Space Models (SSMs) with different memory management strategies, specifically testing if a disciplined memory policy improves throughput under 8GB VRAM constraints. Background State Space Mo...	04-06 09:26	Success	-	View
exp_pytrain.20260406081147.052_20260406_081231 Paper: pytrain.20260406081147.052	Robust Dynamic Plugin Loader using Structural Typing Overview This benchmark verifies the hypothesis that `typing.Protocol` with `@runtime_checkable` enables an autonomous system to dynamically verify and enforce interface compliance without explicit inheritance. The Challenge In modular plug...	04-06 08:13	Success	-	View
exp_pytrain.20260406073857.051_20260406_073939 Paper: pytrain.20260406073857.051	Type-Safe Dynamic Module Loader Benchmark This benchmark tests the ability to design a robust runtime type checking system using Python's `typing.Protocol`. It simulates a dynamic plugin loader where modules (represented as dictionaries) are inspected for structural compliance with...	04-06 07:40	Success	-	View
exp_self.20260406071042.048_20260406_071123 Paper: self.20260406071042.048	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260406071042.048 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-06 07:12	Success	-	View
exp_pytrain.20260406060536.050_20260406_060556 Paper: pytrain.20260406060536.050	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-06 06:06	Success	-	View
exp_self.20260406053301.047_20260406_053343 Paper: self.20260406053301.047	Self-directed benchmark: ssm strategy stress test This project implements a reproducible benchmark designed to test the hypothesis that applying SSM (State Space Model) strategies with a disciplined memory policy improves throughput under strict VRAM constraints (8GB). The Hypothesis We hy...	04-06 05:34	Success	-	View
exp_pytrain.20260406044142.049_20260406_044216 Paper: pytrain.20260406044142.049	Strict Configuration & Metadata Validator This coding drill evaluates the ability to enforce strict type safety in Python using `TypedDict` and `importlib` for runtime environment verification. Objective The candidate must implement a `PackageManifest` validator and an environment...	04-06 04:43	Success	-	View
exp_self.20260406041826.046_20260406_041851 Paper: self.20260406041826.046	SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that a Disciplined Memory Policy—specifically utilizing Selective State Space Models (SSM) with Dynamic Precision and State Caching—improves throughput under strict VRAM constraints (s...	04-06 04:19	Success	-	View
exp_pytrain.20260406031856.048_20260406_031937 Paper: pytrain.20260406031856.048	Python Skill Fallback Title: Type-Safe Generic Storage Module - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-06 03:20	Success	-	View
exp_self.20260406024855.045_20260406_024930 Paper: self.20260406024855.045	Benchmark: SSM Strategy Stress Test This benchmark evaluates a synthetic Selective State Space Model (SSM) implementation to test memory policies. It compares an optimized configuration (utilizing dynamic precision and disciplined caching) against an ablated configuration (FP...	04-06 02:50	Success	-	View
exp_pytrain.20260406015132.047_20260406_015205 Paper: pytrain.20260406015132.047	Strictly-Typed Tensor Micro-Package CLI This module implements a minimalistic, strongly-typed Tensor micro-package using Python's standard `typing` generics. It demonstrates a domain-specific object design that enforces type consistency across numerical operations while adhering...	04-06 01:53	Success	-	View
exp_2604.03225v1_20260406_013957 Paper: 2604.03225v1	VOSR: Vision-Only Generative Model Benchmark This benchmark evaluates the inference performance of the VOSR (Vision-Only Super-Resolution) model architecture. VOSR distinguishes itself by relying purely on visual data for generation, employing a pretrained vision encoder for semantic...	04-06 01:40	Success	-	View
exp_pytrain.20260406011845.046_20260406_011904 Paper: pytrain.20260406011845.046	Typed Module Dependency Resolver Overview This coding drill benchmarks the creation of a robust dependency resolution mechanism. It emphasizes the use of Python's standard library `typing` module (specifically `TypedDict`) for explicit data structuring and `importlib` for...	04-06 01:20	Success	-	View
exp_self.20260406005911.044_20260406_010023 Paper: self.20260406005911.044	SSM Strategy Stress Test Benchmark Overview This benchmark evaluates the performance of State Space Models (SSM) with and without memory optimization strategies, focusing on techniques inspired by Mamba architecture. The benchmark measures VRAM usage and tokens per second un...	04-06 01:01	Success	-	View
exp_pytrain.20260406001435.045_20260406_001503 Paper: pytrain.20260406001435.045	Python Skill Fallback Title: Robust Type-Safe Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-06 00:16	Success	-	View
exp_self.20260405235250.043_20260405_235317 Paper: self.20260405235250.043	SSM Strategy Stress Test This benchmark validates the hypothesis that applying State Space Model (SSM) strategies with disciplined memory policies improves throughput and efficiency under 8GB VRAM constraints. Overview The benchmark simulates two inference strategi...	04-05 23:54	Success	-	View
exp_pytrain.20260405225940.044_20260405_230006 Paper: pytrain.20260405225940.044	Strictly Typed Plugin System Benchmark This project demonstrates a high-performance, type-safe plugin architecture using Python's standard library. It combines structural subtyping (`typing.Protocol`) with dynamic module loading (`importlib`) to validate and execute plugin code...	04-05 23:01	Success	-	View
exp_self.20260405223743.042_20260405_223812 Paper: self.20260405223743.042	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260405223743.042 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-05 22:39	Success	-	View
exp_pytrain.20260405215000.043_20260405_215027 Paper: pytrain.20260405215000.043	Python Reliability Drill: Strict Typing & Runtime Validation This benchmark implements a robust utility class `StrictValidator` designed to enforce runtime type safety on complex data structures without external dependencies. It simulates the behavior of high-level validation libraries (like Pydantic...	04-05 21:51	Success	-	View
exp_self.20260405212935.041_20260405_212957 Paper: self.20260405212935.041	SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy improves throughput and reduces VRAM usage compared to standard attention mechanisms under strict memory constraints (simulatin...	04-05 21:31	Success	-	View
exp_pytrain.20260405204218.042_20260405_204233 Paper: pytrain.20260405204218.042	PEP 695 Generic Factory Benchmark This benchmark validates the implementation of a generic factory system using Python 3.12's Type Parameter Syntax (PEP 695). It enforces strict namespace management and Protocol-based constraints. Prerequisites - Python 3.12 or higher (Requ...	04-05 20:43	Success	-	View
exp_self.20260405202243.040_20260405_202303 Paper: self.20260405202243.040	SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy significantly improves throughput under strict 8GB VRAM constraints. It contrasts a Baseline SSM (which may naively cache s...	04-05 20:25	Success	-	View
exp_pytrain.20260405193958.041_20260405_194018 Paper: pytrain.20260405193958.041	Runtime-Verified Plugin Architecture Benchmark This benchmark demonstrates an autonomous system's ability to programmatically construct a valid Python package structure on disk and enforce strict structural subtyping (Protocols) on dynamically discovered modules. Objective To test dynam...	04-05 19:41	Success	-	View
exp_self.20260405191951.039_20260405_192023 Paper: self.20260405191951.039	Self-directed benchmark: ssm strategy stress test Objective This benchmark evaluates the hypothesis that applying a Selective State Space Model (SSM) strategy with a disciplined memory policy improves inference throughput and reduces VRAM overhead compared to a standard Transformer-style K...	04-05 19:21	Success	-	View
exp_pytrain.20260405183353.040_20260405_183412 Paper: pytrain.20260405183353.040	Dynamic Kernel Dispatcher with Type Safety Overview This coding drill evaluates the ability to construct a robust plugin architecture similar to backend selection in deep learning frameworks (like PyTorch or LitGPT). The candidate must implement a dispatcher system using Python's `t...	04-05 18:35	Success	-	View
exp_self.20260405181427.038_20260405_181452 Paper: self.20260405181427.038	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260405181427.038 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-05 18:15	Success	-	View
exp_pytrain.20260405172750.039_20260405_172815 Paper: pytrain.20260405172750.039	Robust Plugin Registry with Version Compatibility Simulation Design Brief This coding drill assesses the ability to construct a generic, type-safe registry pattern similar to those found in large-scale frameworks like Transformers or vLLM. The benchmark simulates how these frameworks handle dynamic m...	04-05 17:29	Success	-	View
exp_self.20260405170718.037_20260405_170759 Paper: self.20260405170718.037	Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that applying State Space Models (SSM) with a disciplined memory policy (specifically, the Mamba architecture) improves inference throughput and stabilizes VRAM usage under 8GB constraints co...	04-05 17:09	Success	-	View
exp_pytrain.20260405161902.038_20260405_161930 Paper: pytrain.20260405161902.038	Strictly-Typed Dynamic Plugin Loader Overview This benchmark demonstrates the use of Python's `typing.Protocol` for structural subtyping in a dynamic plugin loading system. Unlike nominal subtyping (Abstract Base Classes), Protocols allow class compatibility based on the prese...	04-05 16:20	Success	-	View
exp_self.20260405155424.036_20260405_155454 Paper: self.20260405155424.036	SSM Strategy Stress Test Benchmark This benchmark evaluates the efficacy of a State Space Model (SSM) memory strategy against a standard Transformer-style baseline. Specifically, it tests the hypothesis that a disciplined memory policy (constant-state recurrence) allows...	04-05 15:55	Success	-	View
exp_pytrain.20260405150811.037_20260405_150842 Paper: pytrain.20260405150811.037	Dynamic Type-Safe Plugin Loader Overview This coding drill benchmark implements a Dynamic Type-Safe Plugin Loader. The objective is to demonstrate how to use Python's `typing.Protocol` and `tempfile` to build a robust system for loading and verifying external code mod...	04-05 15:09	Success	-	View
exp_self.20260405144634.035_20260405_144659 Paper: self.20260405144634.035	SSM Strategy Stress Test Benchmark Overview This benchmark evaluates the performance of State Space Models (SSMs) under strict memory constraints (simulating an 8GB VRAM limit). It compares a Naive Baseline implementation against an Optimized Policy variant that util...	04-05 14:48	Success	-	View
exp_pytrain.20260405135522.036_20260405_135546 Paper: pytrain.20260405135522.036	Strictly Typed Dynamic Plugin Loader Introduction This benchmark demonstrates a robust, zero-trust plugin architecture within a pure Python environment. It leverages Structural Subtyping (Protocols) to enforce interface compatibility at runtime without requiring shared bas...	04-05 13:56	Success	-	View
exp_self.20260405133626.034_20260405_133656 Paper: self.20260405133626.034	Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates a Disciplined Memory Policy applied to a State Space Model (SSM) architecture. The objective is to test the hypothesis that selective state caching and chunk-based processing improve throughput and redu...	04-05 13:38	Success	-	View
exp_pytrain.20260405124901.035_20260405_124923 Paper: pytrain.20260405124901.035	Generic Plugin Architecture with Dynamic Discovery This benchmark demonstrates a robust, type-safe plugin architecture using Python's standard library. Objective The hypothesis is that an autonomous coding system can leverage Python's type system (specifically `typing.Protocol` and Generics...	04-05 12:50	Success	-	View
exp_self.20260405122950.033_20260405_123014 Paper: self.20260405122950.033	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260405122950.033 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-05 12:31	Success	-	View
exp_pytrain.20260405114356.034_20260405_114428 Paper: pytrain.20260405114356.034	Strictly-Typed Generic Dependency Resolver This coding drill validates your ability to write robust, type-safe Python code using advanced `typing` constructs (Generics, Protocols) and classical algorithms (Topological Sort). Objective Implement a generic package manager capable of r...	04-05 11:45	Success	-	View
exp_self.20260405112401.032_20260405_112421 Paper: self.20260405112401.032	Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the SSM (State Space Model) strategy against a baseline attention mechanism under strict 8GB VRAM constraints. The core hypothesis is that applying an SSM with a disciplined memory policy (fixed...	04-05 11:25	Success	-	View
exp_pytrain.20260405103723.033_20260405_103752 Paper: pytrain.20260405103723.033	Python Skill Fallback Title: Generic Plugin Registry with Dynamic Namespace Simulation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-05 10:38	Success	-	View
exp_self.20260405101406.031_20260405_101427 Paper: self.20260405101406.031	SSM Strategy Stress Test This benchmark evaluates the performance implications of applying a disciplined memory policy to State Space Model (SSM) architectures, specifically mimicking the Mamba selective state space approach. Hypothesis Applying SSM with a discipli...	04-05 10:15	Success	-	View
exp_pytrain.20260405091646.032_20260405_091717 Paper: pytrain.20260405091646.032	Type-Safe Plugin Registry Coding Drill This benchmark challenges the implementation of a modular, extensible application framework using Python's standard library type system. The objective is to construct a `ModelRunner` registry that allows for the dynamic registration and ret...	04-05 09:18	Success	-	View
exp_self.20260405085511.030_20260405_085535 Paper: self.20260405085511.030	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260405085511.030 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-05 08:56	Success	-	View
exp_pytrain.20260405080414.031_20260405_080441 Paper: pytrain.20260405080414.031	Type-Safe Dynamic Plugin Loader Benchmark Overview This benchmark evaluates a Python system's ability to synthesize standard library tools—specifically the `typing` and `inspect` modules—to create a robust, type-safe plugin architecture. The Challenge The goal is to implement a dyn...	04-05 08:05	Success	-	View
exp_self.20260405074154.029_20260405_074232 Paper: self.20260405074154.029	Self-directed benchmark: SSM Strategy Stress Test This repository contains a runnable benchmark designed to test the hypothesis that applying State Space Model (SSM) architectures with a disciplined memory policy improves throughput under 8GB VRAM constraints compared to standard recurrent...	04-05 07:43	Success	-	View
exp_pytrain.20260405065622.030_20260405_065644 Paper: pytrain.20260405065622.030	Typed Asynchronous Plugin Loader A Python coding drill designed to test strict type adherence, packaging standards (PEP 8), and asynchronous concurrent execution capabilities within the standard library. Objective Build a robust, extensible plugin architecture where plugin...	04-05 06:57	Success	-	View
exp_self.20260405063601.028_20260405_063625 Paper: self.20260405063601.028	SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that a disciplined memory policy applied to State Space Models (SSM) significantly improves throughput and reduces VRAM usage during high-load inference (simulating >8GB context scenarios). Requiremen...	04-05 06:37	Success	-	View
exp_pytrain.20260405054549.029_20260405_054611 Paper: pytrain.20260405054549.029	Strictly-Typed Dynamic Package Generator This benchmark evaluates a system's ability to programmatically synthesize a valid Python package structure. It verifies the system can write advanced static typing constructs (Protocols, Generics, ParamSpec) to disk, generate valid packagi...	04-05 05:47	Success	-	View
exp_self.20260405052336.027_20260405_052410 Paper: self.20260405052336.027	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260405052336.027 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-05 05:25	Success	-	View
exp_pytrain.20260405042503.028_20260405_042552 Paper: pytrain.20260405042503.028	Strict Generic Plugin Registry Benchmark This benchmark evaluates the performance and correctness of a strictly typed plugin system implemented using Python's `typing.Protocol` (PEP 544) and modern Type Parameter syntax (PEP 695). Design Overview The system defines a `Processo...	04-05 04:26	Success	-	View
exp_self.20260405040113.026_20260405_040207 Paper: self.20260405040113.026	Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the performance impact of a disciplined memory policy applied to State Space Models (SSMs), specifically mimicking architectures like Mamba. The test compares a baseline implementation against an optimized...	04-05 04:03	Success	-	View
exp_pytrain.20260405030141.027_20260405_030219 Paper: pytrain.20260405030141.027	Coding Drill Benchmark: Strictly Typed Autograd Mini-Library Robust library architecture relies on strict separation between the public interface and private implementation, enforced by explicit `__all__` declarations and structural subtyping.	04-05 03:03	Success	-	View
exp_self.20260405023405.025_20260405_023438 Paper: self.20260405023405.025	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260405023405.025 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-05 02:35	Success	-	View
exp_pytrain.20260405013040.026_20260405_013116 Paper: pytrain.20260405013040.026	Strictly Typed Dynamic Plugin Registry Objective This benchmark demonstrates a robust plugin architecture using Python's `typing.Protocol` and `runtime_checkable` decorators. Unlike traditional ad-hoc duck typing (which assumes "if it walks like a duck, it's a duck" often leadin...	04-05 01:32	Success	-	View
exp_self.20260405010652.024_20260405_010722 Paper: self.20260405010652.024	SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with disciplined memory policies improves throughput and reduces VRAM usage compared to standard Transformer-based approaches under constrained VRAM (8...	04-05 01:08	Success	-	View
exp_pytrain.20260405001558.025_20260405_001632 Paper: pytrain.20260405001558.025	Strict Protocol-Driven Plugin Loader with Metadata Introspection This benchmark evaluates the ability to construct an extensible plugin architecture using Python's `typing.Protocol`. It enforces strict runtime signature validation using `inspect` and `typing` modules to ensure interface compliance before...	04-05 00:17	Success	-	View
exp_self.20260404235228.023_20260404_235309 Paper: self.20260404235228.023	SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy improves throughput and reduces VRAM usage compared to standard architectures (specifically Attention-based models) under stric...	04-04 23:54	Success	-	View
exp_pytrain.20260404225531.024_20260404_225613 Paper: pytrain.20260404225531.024	Type-Safe Generic Batch Validator Module Benchmark This benchmark evaluates a Python module's ability to define and enforce strict type specifications using modern `typing` features (`Protocol`, `Generic`, `TypeVar`) and packaging standards (`__all__`). Benchmark Design The subject under te...	04-04 22:57	Success	-	View
exp_self.20260404223103.022_20260404_223134 Paper: self.20260404223103.022	Self-directed benchmark: SSM strategy stress test Hypothesis Applying SSM (State Space Model) logic with a disciplined memory policy (simulated here via `dynamic_precision` and efficient state `cache` management) significantly improves throughput and reduces VRAM footprint compared to...	04-04 22:32	Success	-	View
exp_pytrain.20260404213505.023_20260404_213532 Paper: pytrain.20260404213505.023	Strictly-Typed Dynamic Plugin Loader Overview This benchmark evaluates a system's ability to construct a robust, extensible architecture using Python's `typing.Protocol` for interface enforcement and `importlib` for runtime module discovery. Objective Develop a single-file scr...	04-04 21:36	Success	-	View
exp_self.20260404210403.021_20260404_210426 Paper: self.20260404210403.021	SSM Strategy Stress Test This benchmark evaluates the memory efficiency and throughput of State Space Model (SSM) layers when subjected to a disciplined memory policy (dynamic precision and chunked scanning) versus a naive full-precision baseline. Requirements - Py...	04-04 21:05	Success	-	View
exp_pytrain.20260404200851.022_20260404_200927 Paper: pytrain.20260404200851.022	PEP 695 Generic Dependency Resolver Drill Overview This benchmark evaluates your ability to implement generic algorithms using modern Python 3.12+ syntax. Specifically, it tests the implementation of a Type Parameter Syntax (PEP 695) class to perform dependency resolution on a...	04-04 20:10	Success	-	View
exp_self.20260404194638.020_20260404_194708 Paper: self.20260404194638.020	Self-directed benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the Memory Policy of State Space Models (SSMs) compared to standard dense linear transformations (simulating a Transformer block without attention or a standard MLP). The hypothesis is that the selectiv...	04-04 19:48	Success	-	View
exp_pytrain.20260404185708.021_20260404_185741 Paper: pytrain.20260404185708.021	Strictly Typed Dynamic Plugin Loader and Metadata Validator Overview This benchmark evaluates the use of Python's advanced type hinting features (specifically `NewType`, `TypedDict`, and `Protocol`) to construct a robust, strictly typed runtime plugin system. The Hypothesis An autonomous system can...	04-04 18:58	Success	-	View
exp_self.20260404183647.019_20260404_183713 Paper: self.20260404183647.019	SSM Strategy Stress Test This benchmark evaluates the memory efficiency and throughput of a State Space Model (SSM) strategy compared to a standard Attention-based baseline. The goal is to demonstrate that SSMs, utilizing a disciplined memory policy (constant state...	04-04 18:38	Success	-	View
exp_pytrain.20260404174545.020_20260404_174623 Paper: pytrain.20260404174545.020	Dynamic Type-Safe Plugin System This coding drill implements a self-contained benchmark for a robust, dynamic plugin architecture using only Python's standard library. Overview The system simulates a high-performance kernel loader (similar to PyTorch or Lightning backend...	04-04 17:47	Success	-	View
exp_self.20260404172449.018_20260404_172525 Paper: self.20260404172449.018	Self-directed benchmark: ssm strategy stress test This benchmark evaluates the hypothesis that applying State Space Models (SSM) with a disciplined memory policy improves throughput under 8GB VRAM constraints. It compares a Baseline configuration against an Optimized configuration (discipl...	04-04 17:26	Success	-	View
exp_pytrain.20260404163602.019_20260404_163623 Paper: pytrain.20260404163602.019	Python Skill Fallback Title: Strictly Typed Backend Registry and Dependency Resolver - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-04 16:37	Success	-	View
exp_self.20260404161330.017_20260404_161353 Paper: self.20260404161330.017	Self-directed benchmark: ssm strategy stress test This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with a disciplined memory policy improves throughput and reduces VRAM usage compared to standard autoregressive baselines under 8GB constraints. Concep...	04-04 16:17	Success	-	View
exp_pytrain.20260404152841.018_20260404_152902 Paper: pytrain.20260404152841.018	Runtime Plugin System with Structural Subtyping This benchmark implements a dynamic plugin loader that utilizes Python's `typing.Protocol` and `@runtime_checkable` to discover and validate modules at runtime without explicit inheritance. It demonstrates structural subtyping where classes...	04-04 15:30	Success	-	View
exp_self.20260404150914.016_20260404_150948 Paper: self.20260404150914.016	Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the memory efficiency and throughput of State Space Models (SSM) compared to standard Attention-based mechanisms under high-sequence constraints. Hypothesis Applying SSM with a disciplined memory policy (co...	04-04 15:10	Success	-	View
exp_pytrain.20260404142316.017_20260404_142349 Paper: pytrain.20260404142316.017	Typed Extensibility: Protocol-Based Module Discovery README.md This benchmark evaluates an agent's ability to design a fault-tolerant plugin architecture using Python's `typing.Protocol` and dynamic module introspection. Objective Implement a module discovery system that: 1. Defines a strict...	04-04 14:24	Success	-	View
exp_self.20260404140340.015_20260404_140407 Paper: self.20260404140340.015	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260404140340.015 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-04 14:05	Success	-	View
exp_pytrain.20260404131811.016_20260404_131826 Paper: pytrain.20260404131811.016	Dynamic Type-Safe Plugin Loader This benchmark tests the ability to dynamically construct a Python package in memory and enforce strict typing contracts using `typing.Protocol`. Objective The script performs the following complex operations: 1. Protocol Definition: De...	04-04 13:19	Success	-	View
exp_self.20260404125656.014_20260404_125726 Paper: self.20260404125656.014	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260404125656.014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-04 12:58	Success	-	View
exp_pytrain.20260404120709.015_20260404_120739 Paper: pytrain.20260404120709.015	Generic Dependency Container with Importlib Resolution This benchmark tests the ability to construct a robust, zero-dependency dependency injection system using modern Python 3.12 features. Hypothesis Utilizing PEP 695 Type Parameter Syntax and the `importlib` standard library module allows for...	04-04 12:08	Success	-	View
exp_self.20260404114703.013_20260404_114722 Paper: self.20260404114703.013	Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the efficiency gains of State Space Models (SSMs) when optimized with a disciplined memory policy and dynamic precision strategies. The goal is to simulate an "SSM Mamba" style workload under constrained me...	04-04 11:48	Success	-	View
exp_pytrain.20260404105755.014_20260404_105821 Paper: pytrain.20260404105755.014	Typed Plugin Registry with Metadata Parsing This benchmark tests the implementation of a strictly typed plugin system using Python's `typing.Protocol` and `typing.Generic`. It simulates a workflow where components are loaded dynamically based on a configuration dictionary (mimicking...	04-04 10:59	Success	-	View
exp_self.20260404103731.012_20260404_103758 Paper: self.20260404103731.012	SSM Strategy Stress Test This benchmark evaluates the efficacy of a disciplined memory management policy applied to State Space Model (SSM) workloads. Hypothesis Applying an SSM architecture with a disciplined memory policy (chunked execution) significantly reduces...	04-04 10:39	Success	-	View
exp_pytrain.20260404094229.013_20260404_094251 Paper: pytrain.20260404094229.013	Python Skill Fallback Title: Generic Kernel Dispatcher with Strict Typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-04 09:43	Success	-	View
exp_self.20260404091740.011_20260404_091805 Paper: self.20260404091740.011	SSM Strategy Stress Test Benchmark This repository contains a self-contained benchmark designed to test the hypothesis that State Space Models (SSM) with a disciplined memory policy can achieve higher throughput and lower VRAM usage compared to standard attention-based b...	04-04 09:19	Success	-	View
exp_pytrain.20260404082157.012_20260404_082247 Paper: pytrain.20260404082157.012	Strictly-Typed Dynamic Plugin Registry Benchmark Overview This benchmark evaluates the implementation of a robust, type-safe plugin system utilizing Python's `typing.Protocol` and `importlib` features. It simulates an environment where plugin classes are discovered dynamically (mimicking...	04-04 08:23	Success	-	View
exp_self.20260404080148.010_20260404_080210 Paper: self.20260404080148.010	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the impact of a disciplined memory management policy on State Space Model (SSM) inference, specifically targeting throughput and VRAM constraints under 8GB. Objective To validate the hypothesis that applying strict...	04-04 08:03	Success	-	View
exp_pytrain.20260404071349.011_20260404_071412 Paper: pytrain.20260404071349.011	Python Skill Fallback Title: Validated Package Scaffolder - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-04 07:15	Success	-	View
exp_self.20260404065349.009_20260404_065409 Paper: self.20260404065349.009	Self-directed benchmark: SSM Strategy Stress Test Overview This benchmark compares the memory efficiency and throughput of a standard Transformer-style Attention mechanism against an optimized State Space Model (SSM) implementation. The hypothesis is that the SSM strategy, which utilizes a...	04-04 06:55	Success	-	View
exp_pytrain.20260404055919.010_20260404_055945 Paper: pytrain.20260404055919.010	Python Skill Fallback Title: Strictly Typed Async Batch Processor Module - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-04 06:00	Success	-	View
exp_self.20260404053814.008_20260404_053850 Paper: self.20260404053814.008	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260404053814.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-04 05:39	Success	-	View
exp_pytrain.20260404044427.009_20260404_044455 Paper: pytrain.20260404044427.009	Self-Contained Modular Report Generator This benchmark is designed to validate a Python engineer's ability to create a production-grade, self-contained module architecture within a single file. Hypothesis An autonomous coding system can simulate production-grade package architect...	04-04 04:45	Success	-	View
exp_self.20260404042146.007_20260404_042210 Paper: self.20260404042146.007	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260404042146.007 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-04 04:23	Success	-	View
exp_pytrain.20260404033417.008_20260404_033439 Paper: pytrain.20260404033417.008	Python Skill Fallback Title: Generic Repository Pattern with Modern Packaging - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-04 03:35	Success	-	View
exp_self.20260404031350.006_20260404_031429 Paper: self.20260404031350.006	Self-directed benchmark: ssm strategy stress test Overview This benchmark evaluates the efficiency of State Space Models (SSMs) under constrained memory environments. Specifically, it tests the hypothesis that applying an SSM with a disciplined memory policy (encompassing dynamic precision...	04-04 03:15	Success	-	View
exp_pytrain.20260404022416.007_20260404_022435 Paper: pytrain.20260404022416.007	Type-Safe Dynamic Component Instantiation Benchmark This benchmark tests the ability to implement a generic factory pattern commonly used in large-scale AI frameworks (like PyTorch or LitGPT) where model architectures are defined via string paths. Objective Implement a robust system to: 1. D...	04-04 02:25	Success	-	View
exp_self.20260404020402.005_20260404_020428 Paper: self.20260404020402.005	SSM Strategy Stress Test This benchmark evaluates the hypothesis that State Space Models (SSM) with a disciplined memory policy provide higher throughput and lower VRAM usage compared to standard Attention mechanisms under constrained memory environments (8GB l...	04-04 02:05	Success	-	View
exp_pytrain.20260404011543.006_20260404_011608 Paper: pytrain.20260404011543.006	Type-Safe Auto-Registering Model Registry Benchmark This benchmark evaluates a Python-centric architecture pattern designed to simplify the management of complex ML pipelines (e.g., Diffusers, vLLM). By leveraging `__init_subclass__` and `typing.Protocol`, we eliminate boilerplate code assoc...	04-04 01:17	Success	-	View
exp_gh_VectorInstitute_odyssey_20260404_010257 Paper: gh_VectorInstitute_odyssey	VectorInstitute/odyssey Paper ID: gh_VectorInstitute_odyssey - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recover...	04-04 01:03	Success	-	View
exp_pytrain.20260404004118.005_20260404_004137 Paper: pytrain.20260404004118.005	Strictly-Typed Dynamic Module Loader Benchmark This benchmark tests the ability to construct a secure, type-checked plugin system using only the Python standard library. The program dynamically creates a Python package on the fly, defines a strict `Protocol` interface, and utilizes `imp...	04-04 00:42	Success	-	View
exp_self.20260404002107.004_20260404_002132 Paper: self.20260404002107.004	Self-directed SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that a Selective State Space Model (SSM) implementation, adhering to a disciplined memory policy, improves throughput and reduces VRAM overhead compared to standard Transformer attention mechanisms un...	04-04 00:22	Success	-	View
exp_pytrain.20260403233224.004_20260403_233254 Paper: pytrain.20260403233224.004	Strict Distribution Metadata Introspector Overview This benchmark validates the ability of an autonomous system to programmatically inspect installed Python distributions using the standard library `importlib.metadata` module. It enforces structural integrity of the extracted data...	04-03 23:33	Success	-	View
exp_self.20260403231218.003_20260403_231249 Paper: self.20260403231218.003	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260403231218.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-03 23:13	Success	-	View
exp_pytrain.20260403222240.003_20260403_222317 Paper: pytrain.20260403222240.003	Dynamic Module Loader with Structural Subtyping Benchmark This benchmark tests the ability to design a robust runtime loader for modular components. It utilizes the `importlib` library for dynamic package introspection and the `typing.Protocol` system to enforce structural subtyping (duck typing w...	04-03 22:24	Success	-	View
exp_self.20260403220335.002_20260403_220359 Paper: self.20260403220335.002	SSM Strategy Stress Test Benchmark This benchmark evaluates the performance improvements gained by applying a disciplined memory policy to State Space Models (SSMs), specifically focusing on throughput and VRAM usage under constrained memory environments (8GB target). Object...	04-03 22:05	Success	-	View
exp_pytrain.20260403211944.002_20260403_212009 Paper: pytrain.20260403211944.002	PEP 695 Generic Dependency Resolver Benchmark This benchmark evaluates the developer experience and runtime characteristics of Python 3.12+'s new Type Parameter Syntax (PEP 695) by implementing a generic dependency resolution system. Objective Implement a lightweight package manager re...	04-03 21:21	Success	-	View
exp_self.20260403210039.001_20260403_210100 Paper: self.20260403210039.001	SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying a disciplined memory policy and dynamic precision to State Space Models (SSMs) improves throughput under strict 8GB VRAM constraints. Methodology We simulate a Mamba-like SSM workload us...	04-03 21:02	Success	-	View
exp_pytrain.20260403201417.001_20260403_201451 Paper: pytrain.20260403201417.001	Structurally-Typed Plugin Loader Benchmark This benchmark validates a Python architecture that combines runtime dynamism with static structural typing. Overview The script demonstrates an autonomous plugin loading system. It uses `importlib` to dynamically discover and load modules...	04-03 20:15	Success	-	View
exp_self.20260403200346.012_20260403_200409 Paper: self.20260403200346.012	SSM Strategy Stress Test: Memory vs. Throughput This benchmark evaluates the hypothesis that a State Space Model (SSM) inference strategy (recurrent mode) significantly reduces VRAM usage compared to a standard Attention mechanism (Transformer baseline) under high sequence lengths, while...	04-03 20:04	Pending	-	View
exp_pytrain.20260403191447.023_20260403_191508 Paper: pytrain.20260403191447.023	Python Skill Fallback Title: Type-Safe Async Resource Pool with Internal Package Structure - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-03 19:16	Success	-	View
exp_oa_W7148177295_20260403_190329 Paper: oa_W7148177295	Video Generation Models as World Models: Efficient Paradigms, Architectures and Algorithms Paper ID: oa_W7148177295 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	04-03 19:04	Success	-	View
exp_pytrain.20260403184221.022_20260403_184240 Paper: pytrain.20260403184221.022	Generic Plugin Registry with PEP 695 Syntax This benchmark validates a Python engineer's ability to utilize modern type hinting features introduced in Python 3.12 (PEP 695) to create generic classes without external dependencies. It combines this with advanced standard library usage...	04-03 18:43	Success	-	View
exp_self.20260403182055.011_20260403_182129 Paper: self.20260403182055.011	Self-directed benchmark: ssm strategy stress test This repository contains a benchmark designed to test the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy (specifically, dynamic precision casting) improves throughput and reduces VRAM usage under constra...	04-03 18:22	Success	-	View
exp_pytrain.20260403173545.021_20260403_173605 Paper: pytrain.20260403173545.021	Strictly-Typed Generic Pipeline Overview This benchmark demonstrates the creation of a strictly-typed data transformation pipeline using Python's standard typing utilities. The goal is to maintain type safety across a chain of operations, ensuring that static type checker...	04-03 17:37	Success	-	View
exp_self.20260403171455.010_20260403_171524 Paper: self.20260403171455.010	SSM Strategy Stress Test Benchmark This benchmark evaluates the performance of a State Space Model (SSM) implementation—specifically mimicking Mamba-style selective state spaces—under constrained memory conditions (8GB VRAM simulation). It compares a naive sequential recurre...	04-03 17:16	Success	-	View
exp_pytrain.20260403162737.020_20260403_162810 Paper: pytrain.20260403162737.020	Strictly-Typed Modular Pipeline with Exports Control This benchmark demonstrates the implementation of a robust, modular data pipeline using Python's standard `typing` module and strict module export controls. Design Principles 1. Structural Subtyping: Uses `typing.Protocol` to define int...	04-03 16:29	Success	-	View
exp_self.20260403160744.009_20260403_160806 Paper: self.20260403160744.009	SSM Strategy Stress Test Benchmark This repository contains a self-directed benchmark designed to test the hypothesis that State Space Models (SSM) with a disciplined memory policy (fixed state size) maintain higher throughput and lower VRAM usage than standard Attention...	04-03 16:09	Success	-	View
exp_pytrain.20260403152229.019_20260403_152320 Paper: pytrain.20260403152229.019	Strictly Typed Model Registry & Configuration Loader Overview This benchmark evaluates the implementation of a type-safe plugin architecture using Python's `typing` module. The system mimics the dependency injection patterns found in major ML frameworks like Hugging Face Transformers. Feature...	04-03 15:24	Success	-	View
exp_hf_2603.06679_20260403_151001 Paper: hf_2603.06679	MultiGen: External Memory Benchmark This benchmark evaluates the computational efficiency of the MultiGen architecture compared to standard next-frame diffusion baselines. Innovation Tested: The core hypothesis is that decomposing world simulation into Memory, **O...	04-03 15:11	Success	-	View
exp_pytrain.20260403144441.018_20260403_144522 Paper: pytrain.20260403144441.018	Dynamic Package Loader with Strict Protocol Validation This benchmark tests the engineering capability to design a robust plugin system that bridges Python's dynamic module loading with strict static typing. The goal is to implement a runtime validator that discovers modules dynamically (simula...	04-03 14:46	Success	-	View
exp_self.20260403142205.008_20260403_142249 Paper: self.20260403142205.008	SSM Strategy Stress Test Benchmark This repository contains a benchmark designed to evaluate the efficiency of State Space Model (SSM) architectures under constrained memory conditions (8GB VRAM limit). Objective The benchmark tests the hypothesis that applying **SSM with a...	04-03 14:24	Success	-	View
exp_pytrain.20260403132751.017_20260403_132816 Paper: pytrain.20260403132751.017	Python Skill Fallback Title: Dynamic Plugin Loader with Strict Protocol Enforcement - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-03 13:29	Success	-	View
exp_self.20260403130624.007_20260403_130656 Paper: self.20260403130624.007	Self-directed SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that SSM (State Space Model) strategies significantly improve throughput and reduce VRAM overhead compared to standard Transformer architectures under strict memory constraints (8GB). Hyp...	04-03 13:08	Success	-	View
exp_pytrain.20260403121418.016_20260403_121445 Paper: pytrain.20260403121418.016	Strictly Typed Modular Plugin Loader Overview This coding drill benchmark tests your ability to design a strictly typed, modular plugin system within a single Python file. It leverages advanced type hinting features (`Protocol`, `TypedDict`, `TypeVar`, `overload`) to enforce s...	04-03 12:15	Success	-	View
exp_self.20260403114713.006_20260403_114803 Paper: self.20260403114713.006	Self-directed benchmark: ssm strategy stress test This repository contains a synthetic benchmark designed to test the hypothesis that applying State Space Models (SSM) with a disciplined memory policy improves throughput under 8GB VRAM constraints. Overview The benchmark compares two appro...	04-03 11:49	Success	-	View
exp_pytrain.20260403104743.015_20260403_104811 Paper: pytrain.20260403104743.015	Generic Event Dispatcher with Modern Type Syntax This benchmark implements a thread-safe Generic Event Dispatcher utilizing Python 3.12+ syntax (PEP 695) to define type parameters. It evaluates runtime performance and memory overhead while maintaining strict type hygiene.	04-03 10:49	Success	-	View
exp_self.20260403102243.005_20260403_102314 Paper: self.20260403102243.005	Self-directed benchmark: ssm strategy stress test This benchmark evaluates the efficiency of State Space Models (SSM) compared to standard Attention mechanisms under constrained memory environments. Hypothesis Applying SSM with disciplined memory policy improves throughput under 8GB constr...	04-03 10:24	Success	-	View
exp_pytrain.20260403092847.014_20260403_092910 Paper: pytrain.20260403092847.014	Type-Safe Plugin Registry Benchmark Objective Design a robust, modular component registry using Python's `typing.Protocol` and generic types (`typing.Generic`, `typing.TypeVar`). This benchmark simulates the internal architecture of scalable systems like LitGPT, ensuring loos...	04-03 09:30	Success	-	View
exp_self.20260403090347.004_20260403_090407 Paper: self.20260403090347.004	SSM Strategy Stress Test This benchmark evaluates the memory efficiency and throughput of a State Space Model (SSM) architecture compared to a traditional Transformer architecture under strict VRAM constraints (8GB). Concept The test compares a standard **Transform...	04-03 09:05	Success	-	View
exp_pytrain.20260403080654.013_20260403_080727 Paper: pytrain.20260403080654.013	Python Skill Fallback Title: Runtime Module Loader with Strict Type Verification - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-03 08:08	Success	-	View
exp_self.20260403074228.003_20260403_074249 Paper: self.20260403074228.003	SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying a State Space Model (SSM) strategy with a disciplined memory policy (specifically chunked processing and mixed precision) improves throughput and memory efficiency under strict 8GB VRAM...	04-03 07:43	Success	-	View
exp_pytrain.20260403065322.012_20260403_065356 Paper: pytrain.20260403065322.012	Python Skill Fallback Title: Structural Subtyping Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-03 06:54	Success	-	View
exp_hf_2604.01152_20260403_063908 Paper: hf_2604.01152	Brainstacks: Modular Continual Learning Benchmark This benchmark validates the Brainstacks architecture, focusing on its ability to learn new domains sequentially (continual learning) without catastrophic forgetting, using frozen MoE-LoRA stacks. Key Innovations Validated 1. **Frozen S...	04-03 06:40	Success	-	View
exp_pytrain.20260403061445.011_20260403_061505 Paper: pytrain.20260403061445.011	Strictly Typed Source Distribution Builder This benchmark tests the ability to generate a standards-compliant Python package structure programmatically using only the standard library. Objective Create a script that demonstrates proficiency with: 1. Strict Typing: Utilizing `typ...	04-03 06:16	Success	-	View
exp_cr_10.1038_s41598-026-44804-x_20260403_055828 Paper: cr_10.1038_s41598-026-44804-x	Mamba-based modulated fusion model for video moment retrieval Paper ID: cr_10.1038_s41598-026-44804-x - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Reco...	04-03 05:59	Success	-	View
exp_pytrain.20260403053129.010_20260403_053201 Paper: pytrain.20260403053129.010	Robust Typed Configuration Module This benchmark evaluates a Python module's ability to strictly enforce type safety and adhere to packaging hygiene standards using only the standard library. Objective The goal is to simulate a high-integrity configuration loader typically...	04-03 05:33	Success	-	View
exp_pytrain.20260403045845.009_20260403_045949 Paper: pytrain.20260403045845.009	Python Skill Fallback Title: Robust Dynamic Plugin Loader with Structural Typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-03 05:00	Success	-	View
exp_pytrain.20260403042335.008_20260403_042428 Paper: pytrain.20260403042335.008	Robust Package Dependency Resolver This benchmark evaluates the implementation of a `DependencyResolver` class designed to manage package installation order and detect conflicts using Python's standard library. Implementation Details The `DependencyResolver` class uses `grap...	04-03 04:25	Success	-	View
exp_pytrain.20260403035100.007_20260403_035125 Paper: pytrain.20260403035100.007	Generic Registry with Protocol-Based Plugin Loading This coding drill verifies the capability of an autonomous coding system to construct a robust, type-safe package architecture using only the Python Standard Library. Architecture Overview This benchmark creates a modular plugin architectur...	04-03 03:52	Success	-	View
exp_pytrain.20260403031621.006_20260403_031719 Paper: pytrain.20260403031621.006	Type-Safe Component Registry using Importlib This benchmark demonstrates a robust, extensible plugin architecture using Python's standard library. It leverages `typing.Protocol` for structural subtyping (duck typing) and `typing.Generic` to create a type-safe registry. It simulates a...	04-03 03:18	Success	-	View
exp_pytrain.20260403023931.005_20260403_024054 Paper: pytrain.20260403023931.005	Robust Dynamic Plugin Loader with Structural Subtyping Benchmark This benchmark evaluates a Python system's capability to dynamically discover, load, and validate plugins using structural subtyping (Protocols) rather than explicit inheritance. Design The script creates a secure, ephemeral package structu...	04-03 02:41	Success	-	View
exp_pytrain.20260403020308.004_20260403_020338 Paper: pytrain.20260403020308.004	Type-Safe Modular Log Filter Benchmark Overview This project demonstrates a robust, modular architecture for log filtering using Python's `typing.Protocol` for structural subtyping. It adheres to strict type safety standards and includes a built-in benchmark suite to validate pe...	04-03 02:04	Success	-	View
exp_self.20260403012816.002_20260403_012915 Paper: self.20260403012816.002	SSM Strategy Stress Test Benchmark This benchmark tests the hypothesis that applying SSM with disciplined memory policy improves throughput under 8GB constraints. Overview State Space Models (SSMs) like Mamba have shown impressive capabilities in sequence modeling while main...	04-03 01:30	Success	-	View
exp_pytrain.20260403001853.003_20260403_001911 Paper: pytrain.20260403001853.003	Generic Plugin Registry Benchmark This benchmark evaluates a system's ability to dynamically construct a Python package architecture at runtime, enforce structural typing via `typing.Protocol`, and manage module lifecycles using `importlib`. Scenario The script simulates a...	04-03 00:20	Success	-	View
exp_self.20260402234808.001_20260402_234843 Paper: self.20260402234808.001	Self-Directed Benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that a disciplined memory policy applied to State Space Models (SSMs) improves throughput under constrained VRAM (8GB). The Innovation Standard large language models and naive SSM implementations...	04-02 23:49	Success	-	View
exp_pytrain.20260402224511.002_20260402_224608 Paper: pytrain.20260402224511.002	Generic Plugin Loader & PEP 695 Syntax Benchmark This benchmark evaluates the implementation of a type-safe, generic plugin architecture using Python 3.12's new Type Parameter Syntax (PEP 695). It demonstrates how modern generic syntax (`class MyClass[T]:`) improves code readability over...	04-02 22:47	Success	-	View
exp_pytrain.20260402221115.001_20260402_221156 Paper: pytrain.20260402221115.001	Dynamic Plugin Loader with Strict Protocol Enforcement This benchmark evaluates a system's ability to programmatically construct a Python package in a volatile file system environment and enforce strict type protocols using Python's standard `typing` module. Objective The candidate script must...	04-02 22:12	Success	-	View
exp_self.20260402215337.004_20260402_215404 Paper: self.20260402215337.004	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying a State Space Model (SSM) strategy with a disciplined memory policy improves throughput under 8GB VRAM constraints compared to standard Transformer architectures. Requirements - Python 3...	04-02 21:54	Pending	-	View
exp_pytrain.20260402205031.006_20260402_205102 Paper: pytrain.20260402205031.006	Strictly-Typed Dynamic Component Loader Objective This benchmark challenges you to implement a robust, plugin-like architecture in Python without relying on external frameworks. The goal is to mimic the dynamic loading patterns used in large-scale ML libraries (like vLLM or Huggi...	04-02 20:52	Success	-	View
exp_gh_quic_aimet_20260402_203724 Paper: gh_quic_aimet	AIMET Quantization Benchmark This benchmark evaluates the efficiency of AIMET (AI Model Efficiency Toolkit) for Post-Training Quantization (PTQ). It measures VRAM usage and inference throughput (tokens/sec) of a standard Transformer model before and after applying...	04-02 20:38	Success	-	View
exp_pytrain.20260402201405.005_20260402_201435 Paper: pytrain.20260402201405.005	Runtime-Verified ZipApp Packager This benchmark evaluates an autonomous coding system's ability to programmatically synthesize a Python package structure, enforce strict type compliance on the generated source code using runtime introspection (without external linters), an...	04-02 20:15	Success	-	View
exp_self.20260402195336.003_20260402_195359 Paper: self.20260402195336.003	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260402195336.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	04-02 19:55	Success	-	View
exp_pytrain.20260402190028.004_20260402_190049 Paper: pytrain.20260402190028.004	Strictly Typed CLI Log Processor This coding drill benchmarks the ability to write a robust, strictly-typed Python CLI application using only the standard library. Overview The script `benchmark.py` implements a log processor that: 1. Parses Arguments: Uses `argparse`...	04-02 19:01	Success	-	View
exp_self.20260402183757.002_20260402_183822 Paper: self.20260402183757.002	Self-Directed SSM Strategy Stress Test This benchmark evaluates the performance characteristics of a novel State Space Model (SSM) strategy designed for memory-constrained environments (8GB VRAM limit). The Innovation The proposed method integrates two key optimizations: 1. **Dy...	04-02 18:39	Success	-	View
exp_pytrain.20260402174528.003_20260402_174548 Paper: pytrain.20260402174528.003	Python Skill Fallback Title: Strictly Typed Dynamic Plugin System - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-02 17:46	Success	-	View
exp_self.20260402172433.001_20260402_172454 Paper: self.20260402172433.001	SSM Strategy Stress Test Benchmark This benchmark evaluates the efficiency of State Space Models (SSMs) against standard Attention mechanisms under strict memory constraints. It simulates an 8GB VRAM environment by tracking peak memory allocation and throughput for long-cont...	04-02 17:26	Success	-	View
exp_pytrain.20260402163435.002_20260402_163503 Paper: pytrain.20260402163435.002	Benchmark: Modern Generic Cache Manager with PEP 695 This coding drill validates the implementation of a generic `LRUCache` class utilizing the new PEP 695 Type Parameter Syntax introduced in Python 3.12. The objective is to ensure the codebase leverages modern typing features for improved re...	04-02 16:36	Success	-	View
exp_2604.01216v1_20260402_162258 Paper: 2604.01216v1	Benchmark for LAPIS-SHRED This benchmark evaluates the computational performance and reconstruction capability of the LAPIS-SHRED (LAtent Phase Inference from Short time sequences using SHallow REcurrent Decoders) architecture. Architecture Overview LAPIS-SHRED is d...	04-02 16:24	Success	-	View
exp_pytrain.20260402160154.001_20260402_160220 Paper: pytrain.20260402160154.001	Structural Subtyping Plugin Loader Benchmark Overview This benchmark tests the ability to construct a robust, type-safe plugin loading system using Python's `typing.Protocol` and `importlib`. The goal is to discover modules within a package structure, instantiate classes that structur...	04-02 16:03	Success	-	View
exp_self.20260402154924.002_20260402_154959 Paper: self.20260402154924.002	SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that State Space Models (SSMs) with disciplined memory policies (specifically Mamba) offer superior throughput and memory efficiency compared to standard Transformer architectures under strict 8GB VRA...	04-02 15:49	Pending	-	View
exp_pytrain.20260402145859.013_20260402_145918 Paper: pytrain.20260402145859.013	Python Skill Fallback Title: Type-Safe Kernel Dispatcher with Package Semantics - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-02 15:00	Success	-	View
exp_self.20260402143425.001_20260402_143531 Paper: self.20260402143425.001	SSM Strategy Stress Test Benchmark Overview This benchmark evaluates the performance of State Space Models (SSM) under memory constraints. It specifically tests the hypothesis that applying SSM with a disciplined memory policy improves throughput under 8GB VRAM constraints....	04-02 14:37	Success	-	View
exp_pytrain.20260402134712.012_20260402_134741 Paper: pytrain.20260402134712.012	Python Skill Fallback Title: Dynamic Plugin Registry with Structural Typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-02 13:48	Success	-	View
exp_2604.01220v1_20260402_133500 Paper: 2604.01220v1	Universal YOCO for Efficient Depth Scaling Paper ID: 2604.01220v1 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	04-02 13:36	Success	-	View
exp_oa_W4413304852_20260402_132320 Paper: oa_W4413304852	Benchmark: LLM Optimization for PHM on Edge Devices Paper: Large language models for PHM: a review of optimization techniques and applications Type: Review This paper surveys LLM deployment strategies for Prognostics and Health Management (PHM) on resource-constrained industrial hard...	04-02 13:24	Success	-	View
exp_pytrain.20260402130329.011_20260402_130359 Paper: pytrain.20260402130329.011	Python Skill Fallback Title: Dynamic Type-Safe Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-02 13:05	Success	-	View
exp_2411.02985v1_20260402_125300 Paper: 2411.02985v1	Benchmark: Hybrid Sparse Coding with Unrolled Solver Architecture: Hybrid sparse coding model utilizing a concatenated dictionary (Zernike polynomials + complex modes) and a trainable affine transform layer. Inference relies on $L_1$-regularized optimization (sparse recovery) rather than...	04-02 12:54	Success	-	View
exp_pytrain.20260402122944.010_20260402_123039 Paper: pytrain.20260402122944.010	Strictly Typed Async Plugin System This benchmark evaluates a Python plugin architecture that leverages Structural Subtyping (Protocol) and Generics to enforce type safety without explicit inheritance. Objective The goal is to design an asynchronous data processor re...	04-02 12:31	Success	-	View
exp_cr_10.1016_j.aiig.2024.100104_20260402_121708 Paper: cr_10.1016_j.aiig.2024.100104	Convolutional Sparse Coding (CSC) Benchmark Architecture: Proposes a feed-forward Convolutional Sparse Coding (CSC) network designed to replace iterative optimization algorithms. The structure typically utilizes cascaded convolutional layers coupled with non-linear shrinkage...	04-02 12:18	Success	-	View
exp_pytrain.20260402115233.009_20260402_115315 Paper: pytrain.20260402115233.009	Python Skill Fallback Title: Strict Project Metadata Auditor - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-02 11:54	Success	-	View
exp_2603.26465v1_20260402_114107 Paper: 2603.26465v1	Backfill Candidate 2603.26465v1 Architecture: A hybrid model enhancing standard Transformers with Boltzmann Machine constraints. It integrates structured binary gating variables into multi-head attention to model higher-order dependencies, utilizing mean-field variati...	04-02 11:42	Success	-	View
exp_pytrain.20260402111105.008_20260402_111142 Paper: pytrain.20260402111105.008	Python Skill Fallback Title: Strictly-Typed Project Scaffolder - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-02 11:12	Success	-	View
exp_2411.01399v1_20260402_105727 Paper: 2411.01399v1	MambaReg Benchmark: Linear vs. Quadratic Complexity Architecture: MambaReg introduces a hybrid architecture combining Convolutional Neural Networks (CNNs) with Mamba (State Space Models). It extracts local features via convolutions and processes global context via Mamba blocks to handle...	04-02 10:58	Success	-	View
exp_pytrain.20260402102936.007_20260402_103014 Paper: pytrain.20260402102936.007	Strict Metadata Validator and Plugin Loader This drill validates the hypothesis that leveraging Python's structural typing features (`TypedDict`, `Protocol`) alongside `importlib` creates a robust, self-documenting plugin architecture. By defining strict interfaces for metadata and e...	04-02 10:31	Success	-	View
exp_2603.25722v1_20260402_101515 Paper: 2603.25722v1	Benchmark: Parameter-Free Cross-Modal Attention Pooling Architecture: Modifies standard dual-encoder (Contrastive V&L) frameworks. Replaces final global pooling with parameter-free cross-modal attention-pooling to align concept-centric text segments with visual features. **Memory Footpri...	04-02 10:16	Success	-	View
exp_pytrain.20260402094822.006_20260402_094852 Paper: pytrain.20260402094822.006	Python Skill Fallback Title: Runtime Type-Checked Plugin Registry - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-02 09:49	Success	-	View
exp_2410.18794v2_20260402_093622 Paper: 2410.18794v2	Backfill Candidate 2410.18794v2 Architecture: Hybrid model integrating a lightweight "predictor network" (CNN) with a hard-thresholded Convolutional Locally Competitive Algorithm (LCA) solver. The predictor performs "state warm-up," generating a high-quality initial g...	04-02 09:37	Success	-	View
exp_pytrain.20260402090520.005_20260402_090618 Paper: pytrain.20260402090520.005	Generic Plugin Registry with Typed Configuration This benchmark implements a standalone `cli_engine` simulation. It demonstrates advanced type safety features in Python standard library including `Protocol`, `Generic`, `TypeVar`, and `TypedDict`. Architecture 1. TypedDict (`Settings`)...	04-02 09:07	Success	-	View
exp_hf_2603.13904_20260402_085033 Paper: hf_2603.13904	Benchmark for CroBo: Single-Token Visual State Compression Paper: CroBo (Visual States Need What-is-Where Composition) Architecture: CroBo is a self-supervised encoder-decoder framework designed to compress visual observations into a single, compact bottleneck token capturing "what-is-w...	04-02 08:51	Success	-	View
exp_pytrain.20260402082637.004_20260402_082710 Paper: pytrain.20260402082637.004	Python Skill Fallback Title: Dynamic Type-Safe Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-02 08:28	Success	-	View
exp_cr_10.3390_pr13071977_20260402_081430 Paper: cr_10.3390_pr13071977	Backfill Candidate cr_10.3390_pr13071977 Architecture: TransQwen is a specialized fine-tune of Qwen-7B-Chat utilizing DoRA (Weight-Decomposed Low-Rank Adaptation) for parameter-efficient updates and RoPE for positional encoding. This is a **weight-based learning approa...	04-02 08:15	Success	-	View
exp_pytrain.20260402074913.003_20260402_074940 Paper: pytrain.20260402074913.003	Protocol-Driven Extensible CLI Dispatcher This benchmark tests the implementation of a modular command-line interface (CLI) using Python's `typing.Protocol` for structural sub-typing. Objectives 1. Protocol Enforcement: Define a `Command` interface using `typing.Protocol` and `...	04-02 07:50	Success	-	View
exp_2412.00503v3_20260402_073338 Paper: 2412.00503v3	Benchmarking Bio-Plausible Transformers (RFB-kWTA) Architecture: The paper proposes integrating biological homeostasis mechanisms—RFB-kWTA (Random Feedback k-Winners-Take-All) and "Smart" Inhibition—into standard Transformer attention and output layers. These modules use running statist...	04-02 07:34	Success	-	View
exp_pytrain.20260402070205.002_20260402_070257 Paper: pytrain.20260402070205.002	Python Reliability Drill: Typing & Robustness Overview This drill implements a Type-Safe Inference Engine to test your ability to write robust, reusable utilities with strict typing constraints, edge-case handling, and performance monitoring. Objective Create a generic processing u...	04-02 07:03	Success	-	View
exp_cr_10.3390_info16050343_20260402_064550 Paper: cr_10.3390_info16050343	Backfill Candidate cr_10.3390_info16050343 Architecture: Introduces CPSE (encoding) and CPSD (decoding), a framework utilizing Sparse Binary Representations (SDRs) and triadic memory. It extends Context-Dependent Thinning (CDT) to manage nested compositional structures a...	04-02 06:46	Success	-	View
exp_pytrain.20260402061631.001_20260402_061706 Paper: pytrain.20260402061631.001	Generic Plugin Loader with Strict Interface Contracts This benchmark evaluates an implementation of a modular data processing pipeline architecture. It utilizes Python's `typing.Protocol` to define structural subtyping (duck typing with explicit contracts) and `typing.Generic` for type-safe co...	04-02 06:18	Success	-	View
exp_pytrain.20260401075805.001_20260401_075856 Paper: pytrain.20260401075805.001	Runtime-Checked Plugin Loader This benchmark tests a developer's ability to design a robust, type-safe plugin architecture using Python's standard library. Problem Description Create a single-file Python script `benchmark.py` that implements a **Runtime-Checked Plugin L...	04-01 07:59	Success	-	View
exp_pytrain.20260401071752.001_20260401_071825 Paper: pytrain.20260401071752.001	Structural Subtyping Plugin Registry This benchmark simulates a modern plugin architecture where plugins are discovered dynamically and validated against a strict Structural Subtyping (Protocol) contract defined via PEP 544. It tests the ability to: 1. Define a strict `typ...	04-01 07:19	Success	-	View
exp_pytrain.20260401063316.091_20260401_063401 Paper: pytrain.20260401063316.091	Dynamic CLI Architecture with Strict Typing Objective Design a single-file executable Python script (`smart_cli.py`) that demonstrates advanced use of type hints (`Protocol`) and reflection (`importlib`) to build a modular command-line interface. The goal is to simulate the architect...	04-01 06:35	Success	-	View
exp_cr_10.3390_s25010064_20260401_061451 Paper: cr_10.3390_s25010064	Benchmark: Edge-Scale Driver Intent Model (Llama-3-8B + 4-bit) Architecture Built on Llama-3-8B-Instruct, optimized via LoRA to integrate multi-attribute inputs (historical interactions, driver emotion, vehicle/physics state). It functions as an encoder-decoder for intent prediction, treati...	04-01 06:15	Success	-	View
exp_pytrain.20260401054831.090_20260401_054910 Paper: pytrain.20260401054831.090	Python Skill Fallback Title: Asyncio ZipApp Packager - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	04-01 05:50	Success	-	View
exp_2410.16443v4_20260401_053313 Paper: 2410.16443v4	Benchmark: CRATE (Coding RAte TransformEr) vs Standard Transformer Architecture: CRATE (Coding RAte TransformEr) is a "white-box" Transformer variant that explicitly integrates sparse coding mechanisms—specifically coding rate minimization—directly into the network layers to capture low-dimensional dat...	04-01 05:34	Success	-	View
exp_pytrain.20260401050134.089_20260401_050235 Paper: pytrain.20260401050134.089	Generic Plugin Registry with Type Safety Overview This benchmark evaluates a Python developer's ability to construct a robust, type-safe "micro-framework" within a single file. It simulates a modular package architecture by leveraging advanced `typing` constructs (Generics, Protoc...	04-01 05:03	Success	-	View
exp_pytrain.20260401042214.088_20260401_042307 Paper: pytrain.20260401042214.088	Typed Metadata Inspector (PEP 695) This benchmark validates a developer's ability to utilize modern Python 3.12+ type hinting features (PEP 695) in conjunction with the standard library's packaging tooling (`importlib.metadata`). Objective: Implement a Generic class `Pac...	04-01 04:24	Success	-	View
exp_pytrain.20260401032811.087_20260401_032935 Paper: pytrain.20260401032811.087	Type-Safe Dynamic Plugin Registry This benchmark tests the ability to design a robust, dynamic plugin system using Python's `typing.Protocol` and `importlib` modules. Scenario You are building an extensible data processing framework. You must define a strict `Transform` pro...	04-01 03:30	Success	-	View
exp_pytrain.20260401024404.086_20260401_024524 Paper: pytrain.20260401024404.086	Coding Drill: Generic Component Registry Objective Implement a robust, type-safe `Registry` class using Python's standard library. This pattern is common in large-scale ML frameworks (like Diffusers or vLLM) to manage dynamic model loading and configuration without hard-coding dep...	04-01 02:46	Success	-	View
exp_pytrain.20260401015913.085_20260401_020033 Paper: pytrain.20260401015913.085	Typed ZipApp Packager This benchmark tests the ability of an autonomous coding system to construct a lightweight distribution tool using Python's standard library. Objective The candidate must implement a `ZipAppBuilder` class that compiles a dictionary of virtu...	04-01 02:01	Success	-	View
exp_pytrain.20260401011808.084_20260401_011920 Paper: pytrain.20260401011808.084	Type-Safe Modular Data Processor A robust, single-file Python module demonstrating strict type integrity using Generics, Protocols, and modern packaging standards within the standard library. This benchmark simulates a high-throughput data ingestion pipeline. Features - **...	04-01 01:20	Success	-	View
exp_pytrain.20260401003943.083_20260401_004013 Paper: pytrain.20260401003943.083	Dynamic Plugin Loader with Runtime Type Validation Overview This coding drill benchmarks a Python system's ability to implement a secure, modular plugin architecture. It tests the hypothesis that an autonomous system can achieve robust modularity by programmatically generating Python module...	04-01 00:41	Success	-	View
exp_core_299002838_20260401_002015 Paper: core_299002838	Backfill Candidate core_299002838 This review surveys Transformer-based LLMs and multi-modal architectures for Prognostics and Health Management (PHM), specifically targeting deployment on resource-constrained industrial hardware. * Architecture: Focuses on adapting gen...	04-01 00:21	Success	-	View
exp_pytrain.20260331234921.082_20260331_234950 Paper: pytrain.20260331234921.082	PEP 695 Generic Repository Implementation Overview This benchmark evaluates a Python developer's ability to utilize PEP 695 Type Parameter Syntax (introduced in Python 3.12) to define generic classes and functions without relying on legacy `TypeVar` imports. Objective Implement a r...	03-31 23:50	Success	-	View
exp_2410.00340v3_20260331_233316 Paper: 2410.00340v3	Backfill Candidate 2410.00340v3 Assessment: Low Relevance for Inference Optimization Architecture: No new model architecture proposed. The paper introduces a diagnostic tool using Singular Value Decomposition (SVD) on GPT-2 Small’s attention weight matrices to iso...	03-31 23:34	Success	-	View
exp_pytrain.20260331231020.081_20260331_231044 Paper: pytrain.20260331231020.081	Python Typing & Structure Drill: Generic Plugin Registry This drill validates the implementation of a strictly typed, generic plugin system using Python's `typing.Protocol`, `typing.Generic`, and `typing.TypeVar`. It simulates a package structure within a single script by enforcing proper `__all_...	03-31 23:11	Success	-	View
exp_pytrain.20260331223703.080_20260331_223800 Paper: pytrain.20260331223703.080	Python Skill Fallback Title: Strictly-Typed Plugin Registry with Metadata Validation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-31 22:39	Success	-	View
exp_pytrain.20260331214749.079_20260331_214824 Paper: pytrain.20260331214749.079	Python Skill Fallback Title: Typing-Driven Model Registry Factory - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-31 21:49	Success	-	View
exp_pytrain.20260331211005.078_20260331_211154 Paper: pytrain.20260331211005.078	Type-Safe Dynamic Module Loader Benchmark This benchmark evaluates the ability to construct a robust, type-safe plugin architecture using Python's standard library. Objective The goal is to programmatically generate a Python package structure on disk, define a strict structural int...	03-31 21:12	Success	-	View
exp_2401.00243v1_20260331_204734 Paper: 2401.00243v1	UP-RLHF Policy Inference Benchmark Architecture: UP-RLHF introduces a training-time architecture utilizing an ensemble of diverse Low-Rank Adaptations (LoRAs) for the Reward Model (RM). Diversity is enforced by maximizing the nuclear norm of concatenated LoRA matrices. T...	03-31 20:48	Success	-	View
exp_pytrain.20260331201655.077_20260331_201740 Paper: pytrain.20260331201655.077	Strict Typed Plugin System Simulator Overview This coding drill evaluates the system's ability to construct a robust, modular application architecture using modern Python typing constructs (`Protocol`, `TypeVar`, `runtime_checkable`) and standard library introspection tools (`...	03-31 20:18	Success	-	View
exp_2508.16915v3_20260331_200034 Paper: 2508.16915v3	Benchmark for Candidate 2508.16915v3: Reinforcement-Guided Hyper-Heuristic SNN for Fraud Detection Fallback synthesis: Reinforcement-Guided Hyper-Heuristic Hyperparameter Optimization for Fair and Explainable Spiking Neural Network-Based Financial Fraud Detection. Potential 8GB relevance via sparse, rag.	03-31 20:01	Success	-	View
exp_pytrain.20260331192901.076_20260331_192932 Paper: pytrain.20260331192901.076	Generic Dependency Injection Container with Public API Hygiene Overview This benchmark evaluates your ability to construct a robust, type-safe dependency injection (DI) system using Python's standard type hints and packaging best practices. The goal is to create a `ServiceContainer` that manages object...	03-31 19:30	Success	-	View
exp_2507.10855v1_20260331_191835 Paper: 2507.10855v1	Backfill Candidate 2507.10855v1 Fallback synthesis: Sparse Fine-Tuning of Transformers for Generative Tasks. Potential 8GB relevance via sparse, rag.	03-31 19:19	Success	-	View
exp_cr_10.34088_kojose.1658929_20260331_190725 Paper: cr_10.34088_kojose.1658929	Backfill Candidate cr_10.34088_kojose.1658929 Fallback synthesis: Refining Sparse Coding Dictionaries Using High Dimensional Model Representation for Hyperspectral Imagery. Potential 8GB relevance via sparse, rag.	03-31 19:08	Success	-	View
exp_pytrain.20260331184729.075_20260331_184745 Paper: pytrain.20260331184729.075	Generic Command Registry Benchmark This benchmark tests the creation of a robust, extensible command processing pipeline leveraging Python's advanced typing features (Generics and Protocols) and strict packaging standards within a single-file constraint. Objective Develop a...	03-31 18:48	Success	-	View
exp_cr_10.7717_peerj-cs.3388_20260331_183704 Paper: cr_10.7717_peerj-cs.3388	Benchmark: Sparse CNN Efficiency via Feature Decoupling Fallback synthesis: Towards optimal sparse CNNs: sparsity-friendly knowledge distillation through feature decoupling. Potential 8GB relevance via sparse.	03-31 18:38	Success	-	View
exp_2411.04519v2_20260331_182547 Paper: 2411.04519v2	FNet-LZSC: Deep Unfolding Sparse Coding Benchmark Architecture: FNet utilizes Deep Unfolding of an $\ell_0$-regularized Multi-Modal Convolutional Sparse Coding (MCSC) model. The core component is the Learnable $\ell_0$ Sparse Coding (LZSC) block, which explicitly decomposes sou...	03-31 18:26	Success	-	View
exp_pytrain.20260331180541.074_20260331_180614 Paper: pytrain.20260331180541.074	Generic Component Registry & CLI Benchmark This benchmark evaluates the implementation of a type-safe, generic component registry within a strict packaging structure, mimicking the architecture of frameworks like LitGPT. Objectives 1. Packaging Structure: Correctly define module...	03-31 18:07	Success	-	View
exp_2411.00393v4_20260331_175509 Paper: 2411.00393v4	Backfill Candidate 2411.00393v4 Architecture: Replaces scalar regression or one-hot classification layers with population-coded layers. In this scheme, a continuous variable is represented by a distributed activation pattern across a neuron ensemble, mimicking bio...	03-31 17:56	Success	-	View
exp_cr_10.1609_aaai.v40i42.40891_20260331_174359 Paper: cr_10.1609_aaai.v40i42.40891	Benchmark: ToT (Test of Time) Framework for Multimodal LLMs Architecture: ToT is a model-agnostic, inference-time framework for Multimodal LLMs. It operates as a non-invasive "black-box" wrapper, detecting backdoors by analyzing semantic consistency and confidence drift in response to controlled...	03-31 17:45	Success	-	View
exp_pytrain.20260331172420.073_20260331_172442 Paper: pytrain.20260331172420.073	Python Skill Fallback Title: Strictly-Typed Plugin Registry with Dependency Verification - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-31 17:25	Success	-	View
exp_2507.07136v2_20260331_171339 Paper: 2507.07136v2	Benchmark: LangSplatV2 High-Dimensional Language Splatting Fallback synthesis: LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS. Potential 8GB relevance via sparse, inference, rag.	03-31 17:14	Success	-	View
exp_pytrain.20260331165133.072_20260331_165155 Paper: pytrain.20260331165133.072	Python Skill Fallback Title: Strictly Typed Plugin Discovery and Registry - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-31 16:52	Success	-	View
exp_2506.24041v1_20260331_163915 Paper: 2506.24041v1	Backfill Candidate 2506.24041v1 Fallback synthesis: Unsupervised Sparse Coding-based Spiking Neural Network for Real-time Spike Sorting. Potential 8GB relevance via sparse, inference, rag.	03-31 16:40	Success	-	View
exp_pytrain.20260331161811.071_20260331_161843 Paper: pytrain.20260331161811.071	Python Skill Fallback Title: Dynamic Type-Checked Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-31 16:19	Success	-	View
exp_cr_10.61091_jcmcc127a-423_20260331_160544 Paper: cr_10.61091_jcmcc127a-423	Polynomial Matrix Sparse Coding (PMSC) Benchmark Summary for ARES 8GB Roadmap Architecture: The paper proposes a Polynomial Matrix Sparse Coding (PMSC) framework. This is a mathematical approach to signal feature extraction (specifically for non-electrical signals in HVDC valves),...	03-31 16:06	Success	-	View
exp_pytrain.20260331153946.070_20260331_154018 Paper: pytrain.20260331153946.070	Generic Repository with Encapsulated API This benchmark evaluates your ability to construct a robust, type-safe data access layer using Python's advanced typing features (Generics, Protocols) and packaging standards (`__all__`). Objective Implement a Generic Repository pattern wit...	03-31 15:41	Success	-	View
exp_hf_2603.24793_20260331_152843 Paper: hf_2603.24793	AVControl Benchmark: Modular LoRA Injection for LTX-2 Architecture AVControl is a modular framework built on the LTX-2 DiT architecture. It employs a "parallel canvas" mechanism, injecting control modalities (e.g., depth, pose, audio) as additional tokens within attention layers. Each cont...	03-31 15:30	Success	-	View
exp_pytrain.20260331150644.069_20260331_150719 Paper: pytrain.20260331150644.069	Dynamic Typed Package Construction and Verification Overview This benchmark evaluates an autonomous coding system's ability to programmatically generate a valid Python package structure, enforce strict type annotations, manage module visibility, and perform runtime introspection using the st...	03-31 15:08	Success	-	View
exp_2412.08516v2_20260331_145530 Paper: 2412.08516v2	Hybrid Offline Feature Selection for Recommender Systems Architecture: Hybrid offline feature selection pipeline. LLMs provide semantic reasoning to rank feature importance, followed by a lightweight surrogate model that refines these rankings for task-specific optimization. **Memory Footprin...	03-31 14:56	Success	-	View
exp_oa_W4417147545_20260331_144443 Paper: oa_W4417147545	Benchmark: Edge Deployment Optimization for MLLMs Summary for ARES 8GB Roadmap This survey provides a systematic review of optimization strategies for Multimodal Large Language Models (MLLMs), specifically targeting edge deployment constraints relevant to 8GB VRAM limitations. * **Arch...	03-31 14:45	Success	-	View
exp_pytrain.20260331142553.068_20260331_142629 Paper: pytrain.20260331142553.068	Typed PyProject Manifest Validator This benchmark tests the hypothesis that utilizing PEP 484 Type Hints and TypedDicts to model packaging configuration data reduces runtime errors and improves the maintainability of configuration parsers. Objective To create a robust valida...	03-31 14:27	Success	-	View
exp_2603.25720v1_20260331_142303 Paper: 2603.25720v1	R-C2 Benchmark: Cycle-Consistency Latency Overhead Architecture: R-C2 is a Reinforcement Learning (RL) framework designed for Vision-Language Models (VLMs). It enforces a "cycle-consistency" constraint, utilizing backward inference (Answer $\to$ Reconstruction) and modality switching to...	03-31 14:24	Success	-	View
exp_oa_W4413304852_20260331_141203 Paper: oa_W4413304852	Backfill Candidate oa_W4413304852 Paper: Large language models for PHM: a review of optimization techniques and applications Type: Review This paper surveys LLM deployment strategies for Prognostics and Health Management (PHM) on resource-constrained industrial hard...	03-31 14:13	Success	-	View
exp_pytrain.20260331135330.067_20260331_135405 Paper: pytrain.20260331135330.067	Python Skill Fallback Title: Dynamic Component Loader with Runtime Type Verification - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-31 13:55	Success	-	View
exp_pytrain.20260331135202.066_20260331_135231 Paper: pytrain.20260331135202.066	pytrain.20260331135202.066 No summary available yet.	03-31 13:52	Pending	-	View
exp_pytrain.20260331134857.065_20260331_134932 Paper: pytrain.20260331134857.065	pytrain.20260331134857.065 No summary available yet.	03-31 13:49	Pending	-	View
exp_pytrain.20260331134714.064_20260331_134807 Paper: pytrain.20260331134714.064	pytrain.20260331134714.064 No summary available yet.	03-31 13:48	Pending	-	View
exp_pytrain.20260331134412.063_20260331_134502 Paper: pytrain.20260331134412.063	pytrain.20260331134412.063 No summary available yet.	03-31 13:45	Pending	-	View
exp_pytrain.20260331134229.062_20260331_134259 Paper: pytrain.20260331134229.062	pytrain.20260331134229.062 No summary available yet.	03-31 13:42	Pending	-	View
exp_pytrain.20260331133919.061_20260331_134008 Paper: pytrain.20260331133919.061	pytrain.20260331133919.061 No summary available yet.	03-31 13:40	Pending	-	View
exp_pytrain.20260331133740.060_20260331_133818 Paper: pytrain.20260331133740.060	pytrain.20260331133740.060 No summary available yet.	03-31 13:38	Pending	-	View
exp_pytrain.20260331133401.059_20260331_133508 Paper: pytrain.20260331133401.059	pytrain.20260331133401.059 No summary available yet.	03-31 13:35	Pending	-	View
exp_pytrain.20260331133157.058_20260331_133251 Paper: pytrain.20260331133157.058	pytrain.20260331133157.058 No summary available yet.	03-31 13:32	Pending	-	View
exp_pytrain.20260331132843.057_20260331_132918 Paper: pytrain.20260331132843.057	pytrain.20260331132843.057 No summary available yet.	03-31 13:29	Pending	-	View
exp_pytrain.20260331132714.056_20260331_132746 Paper: pytrain.20260331132714.056	pytrain.20260331132714.056 No summary available yet.	03-31 13:27	Pending	-	View
exp_pytrain.20260331132557.055_20260331_132627 Paper: pytrain.20260331132557.055	pytrain.20260331132557.055 No summary available yet.	03-31 13:26	Pending	-	View
exp_pytrain.20260331132321.054_20260331_132356 Paper: pytrain.20260331132321.054	pytrain.20260331132321.054 No summary available yet.	03-31 13:23	Pending	-	View
exp_pytrain.20260331132203.053_20260331_132233 Paper: pytrain.20260331132203.053	pytrain.20260331132203.053 No summary available yet.	03-31 13:22	Pending	-	View
exp_pytrain.20260331131910.052_20260331_131951 Paper: pytrain.20260331131910.052	pytrain.20260331131910.052 No summary available yet.	03-31 13:19	Pending	-	View
exp_pytrain.20260331131751.051_20260331_131810 Paper: pytrain.20260331131751.051	pytrain.20260331131751.051 No summary available yet.	03-31 13:18	Pending	-	View
exp_pytrain.20260331131459.050_20260331_131536 Paper: pytrain.20260331131459.050	pytrain.20260331131459.050 No summary available yet.	03-31 13:15	Pending	-	View
exp_pytrain.20260331131325.049_20260331_131407 Paper: pytrain.20260331131325.049	pytrain.20260331131325.049 No summary available yet.	03-31 13:14	Pending	-	View
exp_pytrain.20260331131045.048_20260331_131109 Paper: pytrain.20260331131045.048	pytrain.20260331131045.048 No summary available yet.	03-31 13:11	Pending	-	View
exp_pytrain.20260331130909.047_20260331_130946 Paper: pytrain.20260331130909.047	pytrain.20260331130909.047 No summary available yet.	03-31 13:09	Pending	-	View
exp_pytrain.20260331130611.046_20260331_130640 Paper: pytrain.20260331130611.046	pytrain.20260331130611.046 No summary available yet.	03-31 13:06	Pending	-	View
exp_pytrain.20260331130428.045_20260331_130517 Paper: pytrain.20260331130428.045	pytrain.20260331130428.045 No summary available yet.	03-31 13:05	Pending	-	View
exp_pytrain.20260331130145.044_20260331_130220 Paper: pytrain.20260331130145.044	pytrain.20260331130145.044 No summary available yet.	03-31 13:02	Pending	-	View
exp_pytrain.20260331130013.043_20260331_130054 Paper: pytrain.20260331130013.043	pytrain.20260331130013.043 No summary available yet.	03-31 13:00	Pending	-	View
exp_pytrain.20260331125903.042_20260331_125929 Paper: pytrain.20260331125903.042	pytrain.20260331125903.042 No summary available yet.	03-31 12:59	Pending	-	View
exp_pytrain.20260331125629.041_20260331_125702 Paper: pytrain.20260331125629.041	pytrain.20260331125629.041 No summary available yet.	03-31 12:57	Pending	-	View
exp_pytrain.20260331125444.040_20260331_125515 Paper: pytrain.20260331125444.040	pytrain.20260331125444.040 No summary available yet.	03-31 12:55	Pending	-	View
exp_pytrain.20260331125204.039_20260331_125238 Paper: pytrain.20260331125204.039	pytrain.20260331125204.039 No summary available yet.	03-31 12:52	Pending	-	View
exp_2411.02985v1_20260331_125051 Paper: 2411.02985v1	2411.02985v1 Architecture: Hybrid sparse coding model utilizing a concatenated dictionary (Zernike polynomials + complex modes) and a trainable affine transform layer. Inference relies on $L_1$-regularized optimization (sparse recovery) rather than...	03-31 12:50	Pending	-	View
exp_cr_10.1016_j.aiig.2024.100104_20260331_124955 Paper: cr_10.1016_j.aiig.2024.100104	cr_10.1016_j.aiig.2024.100104 Architecture: Proposes a feed-forward Convolutional Sparse Coding (CSC) network designed to replace iterative optimization algorithms. The structure typically utilizes cascaded convolutional layers coupled with non-linear shrinkage...	03-31 12:49	Pending	-	View
exp_2603.26465v1_20260331_124906 Paper: 2603.26465v1	2603.26465v1 Architecture: A hybrid model enhancing standard Transformers with Boltzmann Machine constraints. It integrates structured binary gating variables into multi-head attention to model higher-order dependencies, utilizing mean-field variati...	03-31 12:49	Pending	-	View
exp_2411.01399v1_20260331_124712 Paper: 2411.01399v1	2411.01399v1 Architecture: MambaReg introduces a hybrid architecture combining Convolutional Neural Networks (CNNs) with Mamba (State Space Models). It extracts local features via convolutions and processes global context via Mamba blocks to handle...	03-31 12:47	Pending	-	View
exp_2603.25722v1_20260331_124604 Paper: 2603.25722v1	2603.25722v1 Architecture: Modifies standard dual-encoder (Contrastive V&L) frameworks. Replaces final global pooling with parameter-free cross-modal attention-pooling to align concept-centric text segments with visual features. **Memory Footpri...	03-31 12:46	Pending	-	View
exp_2410.18794v2_20260331_124501 Paper: 2410.18794v2	2410.18794v2 Architecture: Hybrid model integrating a lightweight "predictor network" (CNN) with a hard-thresholded Convolutional Locally Competitive Algorithm (LCA) solver. The predictor performs "state warm-up," generating a high-quality initial g...	03-31 12:45	Pending	-	View
exp_hf_2603.13904_20260331_124253 Paper: hf_2603.13904	hf_2603.13904 Paper: CroBo (Visual States Need What-is-Where Composition) Architecture: CroBo is a self-supervised encoder-decoder framework designed to compress visual observations into a single, compact bottleneck token capturing "what-is-w...	03-31 12:42	Pending	-	View
exp_cr_10.3390_pr13071977_20260331_124202 Paper: cr_10.3390_pr13071977	cr_10.3390_pr13071977 Architecture: TransQwen is a specialized fine-tune of Qwen-7B-Chat utilizing DoRA (Weight-Decomposed Low-Rank Adaptation) for parameter-efficient updates and RoPE for positional encoding. This is a **weight-based learning approa...	03-31 12:42	Pending	-	View
exp_2412.00503v3_20260331_124105 Paper: 2412.00503v3	2412.00503v3 Architecture: The paper proposes integrating biological homeostasis mechanisms—RFB-kWTA (Random Feedback k-Winners-Take-All) and "Smart" Inhibition—into standard Transformer attention and output layers. These modules use running statist...	03-31 12:41	Pending	-	View
exp_cr_10.3390_info16050343_20260331_123838 Paper: cr_10.3390_info16050343	cr_10.3390_info16050343 Architecture: Introduces CPSE (encoding) and CPSD (decoding), a framework utilizing Sparse Binary Representations (SDRs) and triadic memory. It extends Context-Dependent Thinning (CDT) to manage nested compositional structures a...	03-31 12:38	Pending	-	View
exp_cr_10.3390_s25010064_20260331_123735 Paper: cr_10.3390_s25010064	cr_10.3390_s25010064 Architecture Built on Llama-3-8B-Instruct, optimized via LoRA to integrate multi-attribute inputs (historical interactions, driver emotion, vehicle/physics state). It functions as an encoder-decoder for intent prediction, treati...	03-31 12:37	Pending	-	View
exp_2410.16443v4_20260331_123638 Paper: 2410.16443v4	2410.16443v4 Architecture: CRATE (Coding RAte TransformEr) is a "white-box" Transformer variant that explicitly integrates sparse coding mechanisms—specifically coding rate minimization—directly into the network layers to capture low-dimensional dat...	03-31 12:36	Pending	-	View
exp_core_299002838_20260331_123415 Paper: core_299002838	core_299002838 This review surveys Transformer-based LLMs and multi-modal architectures for Prognostics and Health Management (PHM), specifically targeting deployment on resource-constrained industrial hardware. * Architecture: Focuses on adapting gen...	03-31 12:34	Pending	-	View
exp_2410.00340v3_20260331_123324 Paper: 2410.00340v3	2410.00340v3 Assessment: Low Relevance for Inference Optimization Architecture: No new model architecture proposed. The paper introduces a diagnostic tool using Singular Value Decomposition (SVD) on GPT-2 Small’s attention weight matrices to iso...	03-31 12:33	Pending	-	View
exp_2411.01399v1_20260331_123226 Paper: 2411.01399v1	2411.01399v1 Architecture: MambaReg introduces a hybrid architecture combining Convolutional Neural Networks (CNNs) with Mamba (State Space Models). It extracts local features via convolutions and processes global context via Mamba blocks to handle...	03-31 12:32	Pending	-	View
exp_hf_2603.24793_20260331_123006 Paper: hf_2603.24793	hf_2603.24793 Architecture AVControl is a modular framework built on the LTX-2 DiT architecture. It employs a "parallel canvas" mechanism, injecting control modalities (e.g., depth, pose, audio) as additional tokens within attention layers. Each cont...	03-31 12:30	Pending	-	View
exp_2506.24041v1_20260331_122908 Paper: 2506.24041v1	2506.24041v1 Fallback synthesis: Unsupervised Sparse Coding-based Spiking Neural Network for Real-time Spike Sorting. Potential 8GB relevance via sparse, inference, rag.	03-31 12:29	Pending	-	View
exp_2508.16915v3_20260331_122807 Paper: 2508.16915v3	2508.16915v3 Fallback synthesis: Reinforcement-Guided Hyper-Heuristic Hyperparameter Optimization for Fair and Explainable Spiking Neural Network-Based Financial Fraud Detection. Potential 8GB relevance via sparse, rag.	03-31 12:28	Pending	-	View
exp_2410.00340v3_20260331_122548 Paper: 2410.00340v3	2410.00340v3 Assessment: Low Relevance for Inference Optimization Architecture: No new model architecture proposed. The paper introduces a diagnostic tool using Singular Value Decomposition (SVD) on GPT-2 Small’s attention weight matrices to iso...	03-31 12:25	Pending	-	View
exp_pytrain.20260331121908.038_20260331_121941 Paper: pytrain.20260331121908.038	Python Skill Fallback Title: Robust Generic Plugin Registry with Metadata Simulation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-31 12:20	Success	-	View
exp_2401.00243v1_20260331_120659 Paper: 2401.00243v1	2401.00243v1 Architecture: UP-RLHF introduces a training-time architecture utilizing an ensemble of diverse Low-Rank Adaptations (LoRAs) for the Reward Model (RM). Diversity is enforced by maximizing the nuclear norm of concatenated LoRA matrices. T...	03-31 12:06	Pending	-	View
exp_oa_W7139145681_20260331_120430 Paper: oa_W7139145681	CARE: Covariance-Aware and Rank-Enhanced Decomposition Benchmark Architecture: CARE converts Grouped-Query Attention (GQA) to Multi-Head Latent Attention (MLA). It replaces standard low-rank SVD baselines with activation-preserving factorization and adjusted-rank allocation, distributing rank...	03-31 12:05	Success	-	View
exp_gh_HyperKuvid-Labs_SpecQuant_20260331_120105 Paper: gh_HyperKuvid-Labs_SpecQuant	HyperKuvid-Labs/SpecQuant Architecture: Proposes an adaptive speculative decoding pipeline. A lightweight classifier routes inputs based on complexity to select specific quantized draft models. These drafts generate tokens verified by a larger FP16 target model....	03-31 12:02	Success	-	View
exp_pytrain.20260331113610.037_20260331_113637 Paper: pytrain.20260331113610.037	Python Skill Fallback Title: Protocol-Based Dynamic Module Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-31 11:37	Success	-	View
exp_cr_10.7717_peerj-cs.3388_20260331_112615 Paper: cr_10.7717_peerj-cs.3388	cr_10.7717_peerj-cs.3388 Fallback synthesis: Towards optimal sparse CNNs: sparsity-friendly knowledge distillation through feature decoupling. Potential 8GB relevance via sparse.	03-31 11:26	Pending	-	View
exp_2509.10033v1_20260331_112420 Paper: 2509.10033v1	Sparse Coding Representation of 2-way Data (AODL) Fallback synthesis: Sparse Coding Representation of 2-way Data. Potential 8GB relevance via linear, sparse.	03-31 11:25	Success	-	View
exp_2410.08003v6_20260331_112118 Paper: 2410.08003v6	More Experts Than Galaxies: Conditionally-overlapping Experts With Biologically-Inspired Fixed Routing Architecture: COMET replaces trainable gating networks with fixed, biologically-inspired random projections. It utilizes a modular, sparse architecture where experts overlap conditionally based on input similarity, rather than remaining...	03-31 11:22	Success	-	View
exp_pytrain.20260331105438.036_20260331_105507 Paper: pytrain.20260331105438.036	PEP 561 Compliant Package Scaffolder An autonomous coding system can effectively combine the 'packaging' module structure (PEP 561) with advanced 'typing' constructs (TypedDict, Protocol) to create a robust, metadata-aware build tool without relying on external dependencies li...	03-31 10:56	Success	-	View
exp_cr_10.1609_aaai.v38i12.29237_20260331_105149 Paper: cr_10.1609_aaai.v38i12.29237	OWQ Benchmark: Outlier-Aware Mixed-Precision Quantization Architecture: OWQ utilizes a sensitivity-aware, mixed-precision strategy. It isolates a small subset of structured "outlier" weights—typically sensitive to quantization—and retains them in high-precision (FP16). The remaining dense weig...	03-31 10:52	Success	-	View
exp_2603.25722v1_20260331_105047 Paper: 2603.25722v1	2603.25722v1 Architecture: Modifies standard dual-encoder (Contrastive V&L) frameworks. Replaces final global pooling with parameter-free cross-modal attention-pooling to align concept-centric text segments with visual features. **Memory Footpri...	03-31 10:50	Pending	-	View
exp_cr_10.3390_technologies13120587_20260331_104746 Paper: cr_10.3390_technologies13120587	CALM: Continual Associative Learning Model via Sparse Distributed Memory Fallback synthesis: CALM: Continual Associative Learning Model via Sparse Distributed Memory. Potential 8GB relevance via sparse, memory, inference, rag.	03-31 10:48	Success	-	View
exp_pytrain.20260331102158.035_20260331_102228 Paper: pytrain.20260331102158.035	Generic Extension Loader with Runtime Type Verification This benchmark tests a plugin architecture hypothesis: that explicit generic constraints (PEP 484/695) combined with dynamic module loading (importlib) create a more robust system by catching type mismatches at registration time rather than...	03-31 10:23	Success	-	View
exp_2411.02985v1_20260331_100820 Paper: 2411.02985v1	2411.02985v1 Architecture: Hybrid sparse coding model utilizing a concatenated dictionary (Zernike polynomials + complex modes) and a trainable affine transform layer. Inference relies on $L_1$-regularized optimization (sparse recovery) rather than...	03-31 10:08	Pending	-	View
exp_2603.26465v1_20260331_100654 Paper: 2603.26465v1	2603.26465v1 Architecture: A hybrid model enhancing standard Transformers with Boltzmann Machine constraints. It integrates structured binary gating variables into multi-head attention to model higher-order dependencies, utilizing mean-field variati...	03-31 10:06	Pending	-	View
exp_hf_2603.13904_20260331_100414 Paper: hf_2603.13904	hf_2603.13904 Paper: CroBo (Visual States Need What-is-Where Composition) Architecture: CroBo is a self-supervised encoder-decoder framework designed to compress visual observations into a single, compact bottleneck token capturing "what-is-w...	03-31 10:04	Pending	-	View
exp_cr_10.13052_dgaej2156-3306.40565_20260331_100317 Paper: cr_10.13052_dgaej2156-3306.40565	cr_10.13052_dgaej2156-3306.40565 Fallback synthesis: Energy Efficient Optimization of Current Transformer Error Compensation in Smart Grids Using Sparse Coding and Blockchain-Secured IoT Framework. Potential 8GB relevance via linear, sparse.	03-31 10:03	Pending	-	View
exp_2507.07136v2_20260331_100214 Paper: 2507.07136v2	2507.07136v2 Fallback synthesis: LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS. Potential 8GB relevance via sparse, inference, rag.	03-31 10:02	Pending	-	View
exp_cr_10.3390_info16050343_20260331_095938 Paper: cr_10.3390_info16050343	cr_10.3390_info16050343 Architecture: Introduces CPSE (encoding) and CPSD (decoding), a framework utilizing Sparse Binary Representations (SDRs) and triadic memory. It extends Context-Dependent Thinning (CDT) to manage nested compositional structures a...	03-31 09:59	Pending	-	View
exp_pytrain.20260331094205.034_20260331_094234 Paper: pytrain.20260331094205.034	Python Skill Fallback Title: Type-Safe Package Resource Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-31 09:43	Success	-	View
exp_pytrain.20260331090809.033_20260331_090847 Paper: pytrain.20260331090809.033	Type-Safe Dependency Injection Container Overview This benchmark tests the ability to implement a robust, structural-subtyping based Dependency Injection (DI) container using only Python's standard library. Objective Implement a `Container` class that leverages `typing.Protocol` t...	03-31 09:09	Success	-	View
exp_2411.13117v2_20260331_085323 Paper: 2411.13117v2	Benchmark: Amortisation Gap in Sparse Autoencoders Architecture: Proposes decoupling the SAE pipeline. Replaces the standard single-pass linear encoder with iterative sparse inference algorithms (e.g., optimization-based solvers like ISTA) to recover accurate latent codes, while retaini...	03-31 08:54	Success	-	View
exp_pytrain.20260331082649.032_20260331_082714 Paper: pytrain.20260331082649.032	Strict Package Introspection & Typed Configuration Validator Benchmark This benchmark evaluates the robustness of a Python coding system in implementing strict type safety, Generic programming, and runtime environment introspection using only the Python Standard Library. Objective Create a dependency managemen...	03-31 08:28	Success	-	View
exp_2603.26323v1_20260331_081234 Paper: 2603.26323v1	This benchmark tests the Computational Primitives of Spatial Reasoning in Large Language Models, inspired by recent... Assessment for ARES 8GB Roadmap This paper investigates the internal spatial reasoning capabilities of standard multilingual Transformer architectures using linear probing and sparse autoencoders. It decomposes reasoning into three prim...	03-31 08:13	Success	-	View
exp_pytrain.20260331074457.031_20260331_074525 Paper: pytrain.20260331074457.031	Dynamic Namespace Package Loader with Structural Type Validation This benchmark tests the ability of a Python script to dynamically generate a distributable package structure (zip archive), load it at runtime, and enforce strict structural typing (using `typing.Protocol`) to validate modules without requ...	03-31 07:46	Success	-	View
exp_oa_W4393064007_20260331_073141 Paper: oa_W4393064007	MELT Benchmark Suite: Local Simulation Paper: MELTing point: Mobile Evaluation of Language Transformers Type: Infrastructure/Benchmarking Study (Not RAG/Retrieval). Architecture: Introduces MELT, a headless benchmarking framework for evaluating instruction-tune...	03-31 07:32	Success	-	View
exp_pytrain.20260331070541.030_20260331_070609 Paper: pytrain.20260331070541.030	Strictly-Typed Namespace Dispatcher Drill This benchmark validates your ability to implement a strictly-typed command pattern using Python's `typing.Protocol`. The objective is to create a robust, type-safe plugin dispatcher system without external dependencies. Instructions 1. Ens...	03-31 07:07	Success	-	View
exp_2411.02199v5_20260331_065052 Paper: 2411.02199v5	Provably Transformers Harness Multi-Concept Word Semantics for Efficient In-Context Learning Paper: Provably Transformers Harness Multi-Concept Word Semantics for Efficient In-Context Learning Classification: Theoretical Analysis (Non-Engineering) Roadmap Relevance: Low / None. This paper provides a mathematical proof r...	03-31 06:51	Success	-	View
exp_pytrain.20260331063050.029_20260331_063117 Paper: pytrain.20260331063050.029	Runtime Type-Checked Dynamic Plugin Loader This benchmark evaluates the ability to construct a robust, type-safe plugin loading mechanism using Python's standard library. The task is to dynamically load Python modules from a filesystem path and strictly validate their interface agai...	03-31 06:32	Success	-	View
exp_pytrain.20260331055833.028_20260331_055902 Paper: pytrain.20260331055833.028	Strictly Typed Processor & Modern Packaging Generation Benchmark Overview This benchmark evaluates the ability of a Python script to dynamically construct a valid, modern Python project structure compliant with PEP 621 (using `pyproject.toml`) and generate strictly typed source code utilizing Generics an...	03-31 06:00	Success	-	View
exp_pytrain.20260331052437.027_20260331_052537 Paper: pytrain.20260331052437.027	Typed Neural Architecture Registry with Dynamic Plugin Loading Overview This coding drill benchmarks your ability to design a strictly typed, modular Python framework that mimics the architecture of modern deep learning libraries (like PyTorch or LitGPT). The Challenge You are tasked with implementing...	03-31 05:26	Success	-	View
exp_2603.26365v1_20260331_050240 Paper: 2603.26365v1	SCORE: Dynamic Token Compression Benchmark Architecture: SCORE utilizes a lightweight policy network conditioned on inter-frame residuals ("surprise") to dynamically prune redundant visual tokens. Unlike static merging, it employs Group-wise Reinforcement Learning (RL) to learn...	03-31 05:03	Success	-	View
exp_pytrain.20260331043016.026_20260331_043054 Paper: pytrain.20260331043016.026	Lazy-Loading Submodule Proxy with Type Safety Design Brief This benchmark implements a lazy-loading mechanism designed to minimize the startup overhead of Python applications that depend on heavy libraries (e.g., `torch`, `numpy`, `tensorflow`). This pattern is commonly found in high-p...	03-31 04:31	Success	-	View
exp_pytrain.20260331035202.025_20260331_035256 Paper: pytrain.20260331035202.025	Robust Plugin Loader with Runtime Type Verification Overview This coding drill tests the ability to construct a zero-dependency plugin management system using Python's standard library. The system simulates a package environment where code modules are discovered, loaded, and validated agains...	03-31 03:53	Success	-	View
exp_pytrain.20260331031049.024_20260331_031204 Paper: pytrain.20260331031049.024	The Modular Typed CLI Benchmark This benchmark verifies the architectural robustness of a Python module designed according to strict typing and separation of concerns principles. Objective The benchmark validates a generated module (`data_processor.py`) against three spec...	03-31 03:13	Success	-	View
exp_pytrain.20260331023453.023_20260331_023542 Paper: pytrain.20260331023453.023	Dynamic Type-Safe Plugin Loader This benchmark evaluates the implementation of a robust, loosely-coupled plugin system using Python's standard library. It demonstrates runtime component discovery and validation by defining a strict `typing.Protocol`, dynamically generatin...	03-31 02:36	Success	-	View
exp_pytrain.20260331015839.022_20260331_015931 Paper: pytrain.20260331015839.022	Generic Datastore Benchmark (PEP 695) Overview This benchmark evaluates the implementation of a generic datastore using Python 3.12's Type Parameter Syntax (PEP 695). It verifies type safety, packaging hygiene (`__all__`, `__version__`), and CLI integration using only the Pytho...	03-31 02:00	Success	-	View
exp_pytrain.20260331012202.021_20260331_012236 Paper: pytrain.20260331012202.021	Strictly Typed Plugin Registry Benchmark Objective This benchmark evaluates the ability to write robust, production-grade Python code using advanced standard library features. It tests adherence to strict type checking (`mypy --strict`), packaging hygiene (`__all__`, `__version__`...	03-31 01:23	Success	-	View
exp_2603.26434v1_20260331_010325 Paper: 2603.26434v1	Automating Clinical Information Retrieval from Finnish Electronic Health Records Using Large Language Models Paper: Automating Clinical Information Retrieval from Finnish EHRs Architecture: Clinical Contextual Question Answering (CCQA) framework utilizing open-source LLMs (Llama-3.1-70B, Qwen3-30B) for offline inference on Finnish clinical...	03-31 01:04	Success	-	View
exp_pytrain.20260331003905.020_20260331_003926 Paper: pytrain.20260331003905.020	Python Reliability Drill: Robust Typing & Telemetry Overview This benchmark evaluates your ability to write robust, type-safe Python code using standard library type hints (`typing` module) without external dependencies. The task is to implement a `TypeSafeContainer` that enforces strict typ...	03-31 00:40	Success	-	View
exp_2512.19720v1_20260331_002859 Paper: 2512.19720v1	Benchmark: Per-Axis 1-Bit Weight Deltas Architecture: Proposes a 1-bit delta scheme where fine-tuned weights are stored as the sign of the difference ($\pm 1$) from a base model, augmented with learned per-axis (row/column) FP16 scaling factors derived from a small ca...	03-31 00:30	Success	-	View
exp_pytrain.20260331000621.019_20260331_000656 Paper: pytrain.20260331000621.019	Typed Configuration Dispatch System This benchmark simulates a core component of a machine learning inference framework (similar in design philosophy to Hugging Face `transformers` or `diffusers`). It utilizes Python's static typing features (`Protocol`, `TypedDict`) to decou...	03-31 00:07	Success	-	View
exp_2312.17493v2_20260330_235343 Paper: 2312.17493v2	Benchmark for DP-LoRA Architecture: DP-LoRA integrates Federated Learning (FL) with Low-Rank Adaptation. Clients train lightweight LoRA adapters locally, while a Gaussian mechanism injects noise into weight updates to ensure Differential Privacy (DP), preven...	03-30 23:54	Success	-	View
exp_pytrain.20260330232806.018_20260330_232832 Paper: pytrain.20260330232806.018	Generic Plugin Registry with Runtime Type Validation Hypothesis: Utilizing `typing.Protocol` combined with Generics provides a strict contract for interoperability within a package ecosystem, enabling `importlib`/`inspect`-based loaders to validate plugin compatibility at runtime. This en...	03-30 23:29	Success	-	View
exp_2603.26595v1_20260330_231342 Paper: 2603.26595v1	PQuantML: A Tool for End-to-End Hardware-aware Model Compression PQuantML: End-to-End Hardware-Aware Compression * Architecture: PQuantML is an open-source library providing a unified interface for model compression. It supports structured and unstructured pruning alongside fixed-point quantizati...	03-30 23:14	Success	-	View
exp_pytrain.20260330224418.017_20260330_224455 Paper: pytrain.20260330224418.017	Python Skill Fallback Title: Robust Package Metadata Validator and Entry Point Simulator - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-30 22:45	Success	-	View
exp_oa_W4413681814_20260330_222838 Paper: oa_W4413681814	Dynamic Precision Quantization for Iterative Generative Models Summary: This survey reviews quantization strategies to mitigate the high computational and memory costs of diffusion models. * Architecture: Focuses on the sensitivity of hierarchical, iterative denoising architectures where quanti...	03-30 22:29	Success	-	View
exp_pytrain.20260330215859.016_20260330_215926 Paper: pytrain.20260330215859.016	Dynamic Type-Verified Package Generator This coding drill benchmarks an autonomous system's ability to dynamically scaffold a Python package structure, enforce strict typing via the `typing` module, and validate the module's interface using `importlib` introspection without relyi...	03-30 22:00	Success	-	View
exp_oa_W4413364992_20260330_214542 Paper: oa_W4413364992	Benchmarking Unified Quantization in Generative AI Architecture: This paper is a technical survey of quantization strategies applicable to large-scale autoregressive transformers and diffusion models. It focuses on unified, differentiable quantization frameworks designed to handle the n...	03-30 21:46	Success	-	View
exp_pytrain.20260330211700.015_20260330_211737 Paper: pytrain.20260330211700.015	Typed Plugin Registry Benchmark This coding drill verifies the implementation of a robust, type-safe Plugin Registry using Python 3.12+ features (PEP 695). Features - Modern Syntax: Uses `type` alias statements and generic class parameter syntax (e.g., `class Registry...	03-30 21:18	Success	-	View
exp_cr_10.1609_aaai.v40i32.39899_20260330_210141 Paper: cr_10.1609_aaai.v40i32.39899	RCMoE Benchmark Architecture: RCMoE targets Mixture-of-Experts (MoE) models to reduce the "All-to-All" communication bottleneck. It utilizes Local-Stochastic Quantization to compress intermediate expert outputs row-by-row and **Probabilistic Thresh...	03-30 21:03	Success	-	View
exp_pytrain.20260330203515.014_20260330_203545 Paper: pytrain.20260330203515.014	Python Skill Fallback Title: Dynamic Model Registry with Structural Typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-30 20:36	Success	-	View
exp_oa_W7128864297_20260330_202316 Paper: oa_W7128864297	MiniCPM-SALA Attention Mechanism Benchmark Architecture: 9B parameter model hybridizing Sparse (InfLLM-V2) and Linear (Lightning) attention in a 1:3 ratio using Hybrid Positional Encoding (HyPE) to balance local fidelity with global efficiency. Memory Footprint: Linear atten...	03-30 20:24	Success	-	View
exp_pytrain.20260330195549.013_20260330_195628 Paper: pytrain.20260330195549.013	Strictly-Typed Modular Plugin Registry This benchmark implements a zero-dependency plugin architecture using Python's `typing.Protocol` and `typing.runtime_checkable`. It demonstrates the creation of a `SystemRegistry` capable of runtime type validation and automatic discovery o...	03-30 19:57	Success	-	View
exp_2603.26603v1_20260330_194137 Paper: 2603.26603v1	Benchmark: On-Device LLM Efficiency & Quantization Paradox Summary for ARES 8GB Roadmap This paper provides an empirical analysis of on-device LLMs (0.5B–9B) regarding energy, latency, and quality, utilizing a Samsung Galaxy S25 Ultra. * Architecture: The study identifies **Mixture-of-Exper...	03-30 19:42	Success	-	View
exp_pytrain.20260330191615.012_20260330_191640 Paper: pytrain.20260330191615.012	Type-Safe Dynamic Plugin Loader Benchmark This benchmark tests the system's ability to construct a robust, type-safe plugin architecture using only Python's standard library. It specifically targets advanced features such as structural subtyping (using `typing.Protocol`), dynamic m...	03-30 19:17	Success	-	View
exp_oa_W4400337965_20260330_190205 Paper: oa_W4400337965	KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches This benchmark evaluates 10+ techniques to mitigate KV cache memory growth, the primary bottleneck for long-context inference on 8GB VRAM hardware. * Architecture: Provides a taxonomy of efficiency-focused approaches, including KV quant...	03-30 19:03	Success	-	View
exp_pytrain.20260330183329.011_20260330_183410 Paper: pytrain.20260330183329.011	Coding Drill: Protocol-Based Namespace Loader Objective Design a robust, single-file Python script that implements a dynamic plugin loader. This system leverages Python's structural subtyping (Protocols) to enforce interface compliance without explicit inheritance. The script must simu...	03-30 18:35	Success	-	View
exp_2401.00503v1_20260330_181942 Paper: 2401.00503v1	Backfill Candidate 2401.00503v1 Architecture: Viz proposes a marketplace framework integrating QLoRA to decouple frozen base model weights from trainable adapters. This architecture facilitates a copyright-compliant ecosystem where content licensing is managed explici...	03-30 18:20	Success	-	View
exp_pytrain.20260330175336.010_20260330_175411 Paper: pytrain.20260330175336.010	Python Skill Fallback Title: Strictly Typed Asynchronous Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-30 17:55	Success	-	View
exp_cr_10.55041_ijsrem43474_20260330_173844 Paper: cr_10.55041_ijsrem43474	Developing New AI Model Compression Techniques This survey reviews foundational compression techniques—pruning, quantization, and knowledge distillation—aimed at enabling edge AI. * Architecture: Validates lightweight backbones (MobileNet, SqueezeNet) and structural sparsity as effe...	03-30 17:39	Success	-	View
exp_pytrain.20260330171056.009_20260330_171136 Paper: pytrain.20260330171056.009	Python Skill Fallback Title: Dynamic ZipApp Packager with Runtime Type Verification - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-30 17:12	Success	-	View
exp_core_159796903_20260330_165618 Paper: core_159796903	Benchmark: Transformer vs. Efficient SSM (Mamba-style) Architecture Summary for ARES 8GB Roadmap * Architecture: Surveys compression techniques targeting standard Transformer Attention/FFN blocks. Contrasts these with inherently efficient architectures (Mamba, RetNet, RWKV) designed to replace atten...	03-30 16:57	Success	-	View
exp_pytrain.20260330162958.008_20260330_163042 Paper: pytrain.20260330162958.008	Dynamic Module Loader with Strict Generic Typing Overview This coding drill benchmarks a robust, runtime-verified plugin architecture built entirely with the Python Standard Library. It demonstrates the synergy between PEP 695 (Type Parameter Syntax) and Python's native import machine...	03-30 16:31	Success	-	View
exp_oa_W4391766345_20260330_161648 Paper: oa_W4391766345	A Survey on Transformer Compression Architecture: Reviews compression techniques for standard Transformers (Attention/FFN blocks) and efficient architectures like Mamba, RetNet, and RWKV that utilize linear-complexity mechanisms to bypass quadratic attention constraints....	03-30 16:17	Success	-	View
exp_pytrain.20260330154854.007_20260330_154933 Paper: pytrain.20260330154854.007	Python Skill Fallback Title: Typed Plugin Discovery System - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-30 15:50	Success	-	View
exp_hf_2603.18742_20260330_153604 Paper: hf_2603.18742	6Bit-Diffusion Benchmark Architecture: Proposes an inference-time mixed-precision quantization framework (NVFP4/INT8) and Temporal Delta Cache (TDC) for Video Diffusion Transformers (DiTs). A lightweight predictor dynamically allocates NVFP4 to temporally stabl...	03-30 15:37	Success	-	View
exp_pytrain.20260330151254.006_20260330_151331 Paper: pytrain.20260330151254.006	Strictly-Typed Component Registry and Serialization Benchmark Objective This benchmark tests the ability to construct a robust, plugin-based architecture reminiscent of Hugging Face `diffusers` or `vLLM` using only the Python standard library. Core Concepts 1. Protocol-Based Design: Using `typing....	03-30 15:14	Success	-	View
exp_2412.08890v1_20260330_150104 Paper: 2412.08890v1	Lexico KV Cache Compression Benchmark Architecture Lexico replaces the standard KV cache with a sparse coding framework. It utilizes a small, input-agnostic dictionary of ~4k atoms to reconstruct attention vectors. The encoding process employs **Orthogonal Matching Purs...	03-30 15:02	Success	-	View
exp_pytrain.20260330143502.005_20260330_143540 Paper: pytrain.20260330143502.005	Python Skill Fallback Title: Strict Protocol-Based Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-30 14:36	Success	-	View
exp_oa_W7133137559_20260330_142201 Paper: oa_W7133137559	This benchmark validates the core architectural efficiency claims described in the paper regarding "Tokens as Computatio... Architecture: Theoretical analysis of Transformer embeddings and the $O(n^2)$ complexity of attention mechanisms. Reviews optimization techniques including token pruning, sparse attention, and long-context extensions. **Memory Footprint...	03-30 14:23	Success	-	View
exp_pytrain.20260330135546.004_20260330_135618 Paper: pytrain.20260330135546.004	Python Reliability Drill: Typing & Packaging This benchmark demonstrates the creation of a robust, type-safe data processing utility using only the Python Standard Library. It focuses on strict type checking enforcement at runtime to ensure reliability, utilizing advanced `typing` mod...	03-30 13:57	Success	-	View
exp_oa_W7125352730_20260330_134113 Paper: oa_W7125352730	LLMOrbit: The Efficiency Revolution Benchmark LLMOrbit is a survey analyzing 50+ models to identify efficiency paradigms critical for the ARES 8GB roadmap. It highlights a shift from brute-force scaling to architectural optimization to overcome data scarcity and hardware costs. * *...	03-30 13:42	Success	-	View
exp_pytrain.20260330131606.003_20260330_131652 Paper: pytrain.20260330131606.003	Dynamic Type-Safe Plugin Loader Benchmark This benchmark evaluates a Python architecture that enforces strict type safety on dynamically loaded modules. It tests the hypothesis that `typing.Protocol` combined with `importlib` provides a robust, zero-dependency mechanism for plugin...	03-30 13:17	Success	-	View
exp_cr_10.1145_3725338_20260330_130141 Paper: cr_10.1145_3725338	PQCache Benchmark Architecture & Retrieval Strategy: PQCache reframes KV cache management as an embedding retrieval task. It utilizes Product Quantization (PQ) to compress token keys into compact codes during the prefill phase. During decoding, i...	03-30 13:02	Success	-	View
exp_pytrain.20260330123406.002_20260330_123446 Paper: pytrain.20260330123406.002	Python Skill Fallback Title: Generic Data Buffer with PEP 695 Type Parameters - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-30 12:35	Success	-	View
exp_oa_W4405434119_20260330_121731 Paper: oa_W4405434119	SCBench: Shared Context Benchmark Evaluation Paper: SCBench: A KV Cache-Centric Analysis of Long-Context Methods Relevance to ARES 8GB Roadmap: This paper provides a critical benchmark for optimizing the KV cache lifecycle, specifically for shared contexts (e.g., syste...	03-30 12:20	Success	-	View
exp_pytrain.20260330115255.001_20260330_115322 Paper: pytrain.20260330115255.001	Generic Type-Safe Service Locator This benchmark tests the ability to construct a robust, modular Dependency Injection (DI) container using Python's standard `typing` module. Objective Implement a `ServiceLocator` that decouples interface definitions (Protocols) from concre...	03-30 11:54	Success	-	View
exp_oa_W4405434119_20260330_114104 Paper: oa_W4405434119	SCBench: KV Cache Shared-Context Evaluation Paper: SCBench: A KV Cache-Centric Analysis of Long-Context Methods Relevance to ARES 8GB Roadmap: This paper provides a critical benchmark for optimizing the KV cache lifecycle, specifically for shared contexts (e.g., syste...	03-30 11:41	Pending	-	View
exp_pytrain.20260330111810.001_20260330_111845 Paper: pytrain.20260330111810.001	Type-Safe Dynamic Plugin Loader Benchmark Overview This benchmark demonstrates the implementation of a robust, type-safe plugin system in Python using structural subtyping (`typing.Protocol`) and dynamic module loading (`importlib`). The Hypothesis Using `Protocol` combined with `r...	03-30 11:19	Success	-	View
exp_oa_W4405434119_20260330_110632 Paper: oa_W4405434119	SCBench: Lightweight KV Cache Benchmark Paper: SCBench: A KV Cache-Centric Analysis of Long-Context Methods Relevance to ARES 8GB Roadmap: This paper provides a critical benchmark for optimizing the KV cache lifecycle, specifically for shared contexts (e.g., syste...	03-30 11:06	Pending	-	View
exp_pytrain.20260330103658.001_20260330_103730 Paper: pytrain.20260330103658.001	Dynamic Package Entry Point Validator This benchmark tests the ability to design a robust, type-safe package installation simulator using Python's standard `typing` module. Objective Implement a `validate_and_install` function that enforces strict adherence to: 1. **Data Contra...	03-30 10:38	Success	-	View
exp_oa_W4405434119_20260330_102231 Paper: oa_W4405434119	SCBench: Lightweight KV Cache Evaluation Paper: SCBench: A KV Cache-Centric Analysis of Long-Context Methods Relevance to ARES 8GB Roadmap: This paper provides a critical benchmark for optimizing the KV cache lifecycle, specifically for shared contexts (e.g., syste...	03-30 10:22	Pending	-	View
exp_pytrain.20260330095143.001_20260330_095221 Paper: pytrain.20260330095143.001	Python Skill Fallback Title: Type-Safe Plugin Dispatcher with Protocols - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-30 09:53	Success	-	View
exp_oa_W4405434119_20260330_093714 Paper: oa_W4405434119	SCBench: A KV Cache-Centric Analysis of Long-Context Methods Paper: SCBench: A KV Cache-Centric Analysis of Long-Context Methods Relevance to ARES 8GB Roadmap: This paper provides a critical benchmark for optimizing the KV cache lifecycle, specifically for shared contexts (e.g., syste...	03-30 09:37	Pending	-	View
exp_pytrain.20260330091417.033_20260330_091450 Paper: pytrain.20260330091417.033	Strict Protocol-Based Plugin System with Dynamic Packaging This benchmark evaluates an engineering system's ability to construct a robust, modular architecture using advanced Python type hinting (`typing.Protocol`) and dynamic module loading (`importlib`). Objective The benchmark programmatically s...	03-30 09:15	Success	-	View
exp_pytrain.20260330083653.032_20260330_083727 Paper: pytrain.20260330083653.032	Dynamic Plugin Registry Benchmark This benchmark evaluates the system's ability to construct a robust, framework-style plugin loader using only the Python standard library. Objective Implement a `ModelRegistry` that: 1. Defines a strict `ModelProtocol` using `typing.Protoco...	03-30 08:38	Success	-	View
exp_pytrain.20260330075617.031_20260330_075656 Paper: pytrain.20260330075617.031	Strictly-Typed Dependency Resolver Simulator Overview This benchmark implements a robust package resolution engine using Python's strict type system. It demonstrates the usage of `typing.Protocol`, `typing.Generic`, `@total_ordering`, and `dataclasses` to enforce compile-time logic co...	03-30 07:57	Success	-	View
exp_pytrain.20260330071452.030_20260330_071521 Paper: pytrain.20260330071452.030	Strictly Typed Dynamic Component Loader This coding drill verifies the hypothesis that combining `typing.Protocol` with `importlib` enables the creation of robust, modular systems. Overview The benchmark script (`benchmark.py`) simulates an extensible asynchronous application. It...	03-30 07:16	Success	-	View
exp_pytrain.20260330063706.029_20260330_063756 Paper: pytrain.20260330063706.029	Type-Safe Dynamic Service Locator This coding drill evaluates your ability to implement a robust dependency injection mechanism using Python's standard library. The challenge involves constructing a generic `ServiceLocator` that dynamically loads modules via `importlib` and...	03-30 06:38	Success	-	View
exp_pytrain.20260330055723.028_20260330_055757 Paper: pytrain.20260330055723.028	Strictly Typed Dependency Resolver This benchmark evaluates the implementation of a robust dependency resolution system using Python's advanced standard library typing features. The goal is to ensure type safety, structural subtyping (via Protocols), and runtime integrity du...	03-30 05:58	Success	-	View
exp_pytrain.20260330051735.027_20260330_051805 Paper: pytrain.20260330051735.027	Python Skill Fallback Title: Strictly-Typed Dynamic Module Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-30 05:19	Success	-	View
exp_pytrain.20260330043314.026_20260330_043344 Paper: pytrain.20260330043314.026	Typed Plugin Registry and Configuration Validator This benchmark tests the ability to design a robust, type-safe plugin architecture similar to those found in vLLM or Diffusers. It enforces strict interface compliance using `typing.Protocol`, centralizes component management via a Registry...	03-30 04:34	Success	-	View
exp_pytrain.20260330040024.025_20260330_040051 Paper: pytrain.20260330040024.025	Typed Dynamic Package Loader This benchmark evaluates the ability to construct a Python runtime environment programmatically. The candidate script must define a strict type contract using the `typing` module, materialize a package directory structure on the physical di...	03-30 04:01	Success	-	View
exp_pytrain.20260330032327.024_20260330_032419 Paper: pytrain.20260330032327.024	Python Reliability Drill: Typing Overview This drill tests the ability to implement a robust, type-safe utility class in Python without relying on external type checkers. The `StrictTypeRegistry` class enforces runtime type checking for object storage and retrieval, ensuri...	03-30 03:25	Success	-	View
exp_pytrain.20260330024756.023_20260330_024837 Paper: pytrain.20260330024756.023	Dynamic Package Instantiation and Type Verification This benchmark tests the ability to programmatically generate Python package structures, write strictly typed code, dynamically import the code, and verify its compliance with a defined `typing.Protocol` interface. Description The script pe...	03-30 02:49	Success	-	View
exp_pytrain.20260330021439.022_20260330_021528 Paper: pytrain.20260330021439.022	PEP 695 Generic Repository & Dynamic Packaging Benchmark This benchmark evaluates an autonomous coding system's ability to leverage Python 3.12+ Type Parameter Syntax (PEP 695) and dynamic module packaging mechanics within a single executable script. Features * PEP 695 Syntax: Defines generic...	03-30 02:16	Success	-	View
exp_pytrain.20260330013932.021_20260330_014008 Paper: pytrain.20260330013932.021	Python Skill Fallback Title: Strictly-Typed Generic Module with Encapsulated API - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-30 01:41	Success	-	View
exp_pytrain.20260330005440.020_20260330_005509 Paper: pytrain.20260330005440.020	Python Skill Fallback Title: Strictly Typed Dynamic Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-30 00:56	Success	-	View
exp_pytrain.20260330000834.019_20260330_000912 Paper: pytrain.20260330000834.019	Strictly Typed Dynamic Plugin Loader Hypothesis: Developing a modular architecture similar to HuggingFace Transformers requires mastery of advanced `typing` (Protocols, Generics) to define strict contracts and `importlib` to manage dynamic component discovery, ensuring ext...	03-30 00:10	Success	-	View
exp_pytrain.20260329232402.018_20260329_232443 Paper: pytrain.20260329232402.018	Python Skill Fallback Title: Strictly Typed Plugin Registry with Dynamic Discovery - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 23:25	Success	-	View
exp_pytrain.20260329224257.017_20260329_224312 Paper: pytrain.20260329224257.017	Dynamic Plugin Registry with Structural Subtyping This benchmark tests a Python system's ability to dynamically discover, load, and validate plugins based on structural subtyping (Protocols) rather than explicit inheritance. Objective Create a self-contained script that: 1. Generates a tem...	03-29 22:44	Success	-	View
exp_pytrain.20260329215852.016_20260329_215950 Paper: pytrain.20260329215852.016	Runtime Type-Safe Plugin Packaging Benchmark This benchmark demonstrates advanced Python module internals by dynamically generating a plugin package structure at runtime, loading it via the import system, and enforcing strict structural typing constraints using `typing.Protocol`. Acce...	03-29 22:00	Success	-	View
exp_pytrain.20260329211824.015_20260329_211849 Paper: pytrain.20260329211824.015	Python Skill Fallback Title: Type-Safe Dependency Injection Container - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 21:19	Success	-	View
exp_pytrain.20260329203623.014_20260329_203653 Paper: pytrain.20260329203623.014	Python Skill Fallback Title: Dynamic Package Construction and Importlib Validation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 20:37	Success	-	View
exp_pytrain.20260329195600.013_20260329_195626 Paper: pytrain.20260329195600.013	Python Skill Fallback Title: Robust Type-Checked Plugin Registry - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 19:57	Success	-	View
exp_pytrain.20260329192217.012_20260329_192246 Paper: pytrain.20260329192217.012	Generic Model Registry with Runtime Type Validation This drill implements a robust, modular component loader similar to those used in Hugging Face Transformers or PyTorch. It leverages Python's advanced `typing` features—specifically `Protocol`, `TypeVar`, and `Generic`—to ensure that dynami...	03-29 19:23	Success	-	View
exp_pytrain.20260329183827.011_20260329_183857 Paper: pytrain.20260329183827.011	Log Analysis System Design Drill This drill challenges you to construct a robust, strictly-typed command-line interface (CLI) application in Python. The objective is to process simulated web server logs and generate statistics while demonstrating high-level software archit...	03-29 18:40	Success	-	View
exp_pytrain.20260329180406.010_20260329_180429 Paper: pytrain.20260329180406.010	Type-Safe Plugin Architecture Simulator Benchmark This benchmark evaluates the design and execution of a strictly typed, concurrent plugin system simulated within a single Python script. It enforces modern Python packaging standards (`__version__`, `__all__`) and utilizes advanced typing f...	03-29 18:05	Success	-	View
exp_pytrain.20260329173028.009_20260329_173101 Paper: pytrain.20260329173028.009	Strictly-Typed Modular Resource Processor Benchmark This benchmark assesses the ability to implement a robust, type-safe data processing pipeline using Python's advanced typing features. The candidate must construct a script that simulates a modular package structure, leveraging `Generic`, `...	03-29 17:32	Success	-	View
exp_pytrain.20260329165531.008_20260329_165601 Paper: pytrain.20260329165531.008	Python Skill Fallback Title: Type-Safe Dependency Resolver Simulator - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 16:57	Success	-	View
exp_pytrain.20260329161900.007_20260329_161927 Paper: pytrain.20260329161900.007	Type-Safe Plugin Registry with Dynamic Discovery This coding drill focuses on building a robust, generic plugin system using Python's advanced standard library features, specifically `typing`, `importlib`, and `inspect`. Objective Create a self-contained Python module that implements a ty...	03-29 16:20	Success	-	View
exp_pytrain.20260329154614.006_20260329_154649 Paper: pytrain.20260329154614.006	Type-Safe Plugin Architecture with Resource Encapsulation Overview This benchmark simulates the creation of a robust, production-ready Python package infrastructure. It constructs a local package named `ml_infra` that demonstrates type safety using `typing.Protocol` and robust resource management...	03-29 15:47	Success	-	View
exp_pytrain.20260329151219.005_20260329_151245 Paper: pytrain.20260329151219.005	Python Skill Fallback Title: Strictly-Typed Plugin Registry System - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 15:13	Success	-	View
exp_pytrain.20260329142919.004_20260329_143002 Paper: pytrain.20260329142919.004	Python Skill Fallback Title: Strictly Typed Modular Task Runner - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 14:31	Success	-	View
exp_pytrain.20260329134539.003_20260329_134557 Paper: pytrain.20260329134539.003	Strict Typed Dynamic Extension Loader This benchmark validates an autonomous agent's ability to construct a robust, dependency-free plugin system using Python's standard library. Objective The goal is to programmatically generate a temporary Python package containing multiple m...	03-29 13:47	Success	-	View
exp_pytrain.20260329130847.002_20260329_130928 Paper: pytrain.20260329130847.002	Python Skill Fallback Title: Generic Configuration Manager with PEP 695 Syntax - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 13:10	Success	-	View
exp_pytrain.20260329122856.001_20260329_122920 Paper: pytrain.20260329122856.001	Structural Typing and Dynamic Module Loading Overview This benchmark evaluates a script's ability to leverage Python's `typing` module for structural subtyping (Protocols) and `importlib` for dynamic package loading. It simulates a plugin system where a Python package is constructed a...	03-29 12:30	Success	-	View
exp_pytrain.20260329113221.005_20260329_113304 Paper: pytrain.20260329113221.005	Dynamic Plugin Architecture with Structural Typing Objective Design a robust, extensible plugin system leveraging Python's `importlib` for runtime module discovery and `typing.Protocol` for enforcing strict interface compliance without explicit inheritance. Scenario You are building a data...	03-29 11:34	Success	-	View
exp_pytrain.20260329105253.004_20260329_105323 Paper: pytrain.20260329105253.004	Python Skill Fallback Title: Dynamic Plugin Loader with Type-Safe Interface Enforcement - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 10:54	Success	-	View
exp_pytrain.20260329101727.003_20260329_101751 Paper: pytrain.20260329101727.003	Dynamic Type-Verified Package Constructor Benchmark Overview This benchmark evaluates a system's ability to programmatically synthesize a valid Python package structure at runtime. It verifies that the generated code adheres to strict `typing.Protocol` definitions and can be successfully int...	03-29 10:18	Success	-	View
exp_pytrain.20260329094511.002_20260329_094543 Paper: pytrain.20260329094511.002	Modern Generic Data Container Benchmark (PEP 695) Overview This benchmark evaluates the implementation of a generic, thread-safe data container utilizing Python 3.12's PEP 695 Type Parameter Syntax. It verifies the developer's ability to define scoped type parameters, constrained types...	03-29 09:46	Success	-	View
exp_pytrain.20260329085930.001_20260329_085953 Paper: pytrain.20260329085930.001	Generic Plugin Loader with Namespace Hygiene Overview This benchmark validates a Python implementation of a robust, type-safe event processing system using only the standard library. It enforces strict structural subtyping (Protocol-based), generic programming, and namespace hygiene s...	03-29 09:00	Success	-	View
exp_pytrain.20260329083145.001_20260329_083245 Paper: pytrain.20260329083145.001	Python Skill Fallback Title: Strictly-Typed Plugin Registry with Structural Subtyping - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:33	Success	-	View
exp_pytrain.20260329081229.001_20260329_081251 Paper: pytrain.20260329081229.001	Robust Typed Plugin System Benchmark This benchmark evaluates the implementation of a strictly typed plugin system using Python's standard `typing` module features introduced in recent versions (specifically `Protocol`, `TypeVar`, and `Generic`). Context The script `benchmark....	03-29 08:13	Success	-	View
exp_2302.00100v2_20260306_173656 Paper: 2302.00100v2	This benchmark evaluates the performance of a Physics-Informed Reduced-Order Model (PI-ROM) for simulating the Time-... README.md This benchmark evaluates the performance of a Physics-Informed Reduced-Order Model (PI-ROM) for simulating the Time-Dependent Schrödinger Equation (TDSE), as described in the innovation 2302.00100v2. The goal is to demonstra...	03-29 08:01	Success	-	View
exp_2302.00107v1_20260306_172525 Paper: 2302.00107v1	Benchmark: Sequential Adaptive Aggregation for Federated GLMs README.md Benchmark: Sequential Adaptive Aggregation for Federated GLMs This benchmark implements the Sequential Data-Driven Aggregation method described in paper 2302.00107v1. It demonstrates the improvement in statistical integrity an...	03-29 08:01	Success	-	View
exp_2302.00129v1_20260307_053731 Paper: 2302.00129v1	Explanation of the Benchmark Design This benchmark evaluates the core claim of the innovation: Efficiency without Optimization. The paper argues that the topological efficiency of syntactic structures (short dependency lengths) arises naturally from a **sublinear preferen...	03-29 08:01	Success	-	View
exp_2302.00129v1_20260307_071741 Paper: 2302.00129v1	Benchmark: Syntactic Topological Efficiency README.md Benchmark: Syntactic Topological Efficiency This benchmark investigates the "Universal Topological Regularities of Syntactic Structures." It tests the hypothesis that syntactic efficiency (minimized dependency length) can arise fr...	03-29 08:01	Success	-	View
exp_2302.00136v2_20260306_180733 Paper: 2302.00136v2	Benchmark: Differentiable Topological Loss (RTD) README.md Benchmark: Differentiable Topological Loss (RTD) Innovation Source: arXiv:2302.00136v2 Core Concept: Integration of Topological Data Analysis (TDA) directly into Deep Learning loss functions via Representation Topology Div...	03-29 08:01	Success	-	View
exp_2302.00136v2_20260307_053806 Paper: 2302.00136v2	RTD-AE: Representation Topology Divergence Autoencoder Benchmark README.md RTD-AE: Representation Topology Divergence Autoencoder Benchmark This benchmark evaluates the implementation of RTD-AE (Backfill Candidate 2302.00136v2), an autoencoder architecture constrained by a Representation Topology Div...	03-29 08:01	Success	-	View
exp_2302.00136v2_20260307_053923 Paper: 2302.00136v2	Here is the benchmark design for the RTD-AE (Representation Topology Divergence Autoencoder). README.md	03-29 08:01	Success	-	View
exp_2302.10800v1_20260307_072844 Paper: 2302.10800v1	Backfill Candidate 2302.10800v1 (KG-Hub Data Infrastructure) bash python benchmark.py	03-29 08:01	Success	-	View
exp_2303.01590v4_20260306_172454 Paper: 2303.01590v4	Here is the design for the benchmark. README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_2303.01610v1_20260306_174735 Paper: 2303.01610v1	Benchmark: Self-Slimmable Sparse Mixture of Experts (SMoE-Dropout) README.md Benchmark: Self-Slimmable Sparse Mixture of Experts (SMoE-Dropout) Overview This benchmark evaluates the SMoE-Dropout architecture (Candidate 2303.01610v1). The core innovation is the replacement of learned, complex routing po...	03-29 08:01	Success	-	View
exp_2304.00387v1_20260307_105418 Paper: 2304.00387v1	Benchmark: Backfill Candidate 2304.00387v1 (HaLP) Architecture: Introduces a lightweight augmentation-free contrastive learning framework. The HaLP module hallucinates synthetic positive samples directly in the latent space using a closed-form solver, replacing the need for complex geo...	03-29 08:01	Success	-	View
exp_2304.01222v1_20260307_155227 Paper: 2304.01222v1	Benchmark: NeuroDAVIS (Parametric Dimensionality Reduction) Architecture NeuroDAVIS employs an unsupervised deep neural network designed for dimensionality reduction. It extracts features non-linearly, theoretically preserving high-dimensional neighborhood relationships (local and global structu...	03-29 08:01	Success	-	View
exp_2306.00204v1_20260306_180653 Paper: 2306.00204v1	Benchmark for Directional Sharpness and Coordinate-wise Clipping README.md Benchmark for Directional Sharpness and Coordinate-wise Clipping Innovation Overview This benchmark evaluates the optimization technique Coordinate-wise Clipping proposed in the analysis of "Directional Sharpness". The Theory...	03-29 08:01	Success	-	View
exp_2306.01009v1_20260306_174523 Paper: 2306.01009v1	Section 1: README.md Benchmark: Scale vs. Reasoning Robustness Innovation: Backfill Candidate 2306.01009v1 Core Finding: Deductive reasoning in Transformer-Decoders is an emergent property of Scale. Larger models maintain reasoning robustness regardless...	03-29 08:01	Success	-	View
exp_2306.17848v1_20260307_105126 Paper: 2306.17848v1	Benchmark: Patch Mixing on CNNs (Backfill 2306.17848v1) README.md Benchmark: Patch Mixing on CNNs (Backfill 2306.17848v1) This benchmark evaluates the Patch Mixing augmentation strategy as applied to a standard ResNet-18 architecture. Patch Mixing is a training-time augmentation that randoml...	03-29 08:01	Success	-	View
exp_2307.00065v1_20260307_104653 Paper: 2307.00065v1	Benchmark: Dense Scene Interaction Prediction (Candidate 2307.00065v1) README.md Benchmark: Dense Scene Interaction Prediction (Candidate 2307.00065v1) Overview This benchmark validates the "Purely Data-Driven" approach described in Backfill Candidate 2307.00065v1. The abstract highlights that this model rel...	03-29 08:01	Success	-	View
exp_2307.00097v3_20260307_110516 Paper: 2307.00097v3	This benchmark evaluates the POLE (Prompt-only Learning) innovation, focusing on its proposed highly efficient memor... README.md This benchmark evaluates the POLE (Prompt-only Learning) innovation, focusing on its proposed highly efficient memory footprint and fast inference speed for Weakly Supervised Semantic Segmentation (WSSS). **Innovation Highligh...	03-29 08:01	Success	-	View
exp_2307.00112v2_20260307_153809 Paper: 2307.00112v2	Local Medical Domain Evaluation Benchmark README.md Local Medical Domain Evaluation Benchmark Overview This benchmark adapts the methodology of "Backfill Candidate 2307.00112v2" (evaluation of LLMs on medical exams) to a local, constrained environment. While the original paper...	03-29 08:01	Success	-	View
exp_2307.00119v1_20260307_104745 Paper: 2307.00119v1	Benchmark: Retrieval-Augmented Generation (RAG) with DPR README.md Benchmark: Retrieval-Augmented Generation (RAG) with DPR This benchmark evaluates the architecture described in 2307.00119v1, which proposes decoupling knowledge storage from model parameters. Architecture Overview Instead of...	03-29 08:01	Success	-	View
exp_2307.00149v1_20260306_172849 Paper: 2307.00149v1	HNC-CAD Architecture Benchmark README.md HNC-CAD Architecture Benchmark This benchmark evaluates the performance of the HNC-CAD (Hierarchical Neural Code for Computer-Aided Design) architecture. The core innovation involves decomposing CAD construction into a **3-lev...	03-29 08:01	Success	-	View
exp_2307.00149v1_20260307_094933 Paper: 2307.00149v1	Benchmark: Hierarchical VQ-VAE CAD Generation (ARES 8GB Optimization) README.md Benchmark: Hierarchical VQ-VAE CAD Generation (ARES 8GB Optimization) This repository contains a minimal, runnable benchmark to evaluate the performance and memory footprint of a Hierarchical VQ-VAE architecture coupled with Casca...	03-29 08:01	Success	-	View
exp_2307.00150v1_20260307_085733 Paper: 2307.00150v1	Benchmark: Local Automated Code Feedback (Backfill 2307.00150v1) README.md Benchmark: Local Automated Code Feedback (Backfill 2307.00150v1) Objective: This benchmark validates the feasibility of replacing the cloud-based GPT-3.5 API (described in the source paper) with a locally hosted, quantized Sma...	03-29 08:01	Success	-	View
exp_2307.00154v2_20260307_104903 Paper: 2307.00154v2	```markdown bash python benchmark.py	03-29 08:01	Success	-	View
exp_2307.00169v1_20260307_154831 Paper: 2307.00169v1	VoxWatch Benchmark Simulation README.md VoxWatch Benchmark Simulation This directory contains a lightweight simulation of the VoxWatch benchmark logic, designed to quantify the "False-Alarm Problem" in Open-Set Speaker Identification (OSI). The Innovation The core i...	03-29 08:01	Success	-	View
exp_2307.00171v1_20260307_154752 Paper: 2307.00171v1	Benchmark: NLP Inference via Integer Linear Programming (ILP) README.md Benchmark: NLP Inference via Integer Linear Programming (ILP) This benchmark evaluates the performance characteristics of NLP inference formulated as an Integer Linear Programming (ILP) problem, as discussed in the methodology of...	03-29 08:01	Success	-	View
exp_2307.00174v1_20260307_154614 Paper: 2307.00174v1	--- README.md --- Benchmark: Candidate 2307.00174v1 (Memory-Optimized Multimodal Segmentation) This benchmark evaluates a synthetic implementation of the architecture described in arXiv 2307.00174v1 ("Prior Prompt Encoder with Multimodal Fusion...	03-29 08:01	Success	-	View
exp_2308.15620v1_20260306_180946 Paper: 2308.15620v1	Here is the benchmark design for the "Fuzzy-Enhanced Hybrid Predictive System" (Backfill Candidate 2308.15620v1). This benchmark evaluates the throughput and memory footprint of the proposed Hybrid Intelligence Architecture compared to a traditional statistical baseline. --- README.md Benchmark: Fuzzy-Enhanced Hybrid Predictive System vs. Traditional M...	03-29 08:01	Success	-	View
exp_2309.16829v2_20260306_174101 Paper: 2309.16829v2	Benchmark: Derivative-Free Feynman-Kac PINN README.md Benchmark: Derivative-Free Feynman-Kac PINN This benchmark evaluates the performance differences between a standard Physics-Informed Neural Network (PINN) relying on Automatic Differentiation (AutoGrad) and the **Derivative-Fr...	03-29 08:01	Success	-	View
exp_2309.16870v1_20260306_170641 Paper: 2309.16870v1	Backfill Candidate 2309.16870v1: Recurrent Fusion Benchmark Architecture LEF proposes a recurrent "late-to-early" fusion scheme that injects object-aware latent embeddings into the early stages of a pillar-based detector. It processes temporally aligned sparse pillar tokens using window-based at...	03-29 08:01	Failed	GPU_REQUIRED policy blocked benchmark execution.	View
exp_2309.16898v1_20260306_172419 Paper: 2309.16898v1	Benchmark: Hybrid Edge-Cloud Pipeline for Humanoid Interaction (Candidate 2309.16898v1) README.md Benchmark: Hybrid Edge-Cloud Pipeline for Humanoid Interaction (Candidate 2309.16898v1) This benchmark evaluates the performance characteristics of a Hybrid Pipeline Architecture designed for resource-constrained humanoid plat...	03-29 08:01	Success	-	View
exp_2311.16339v1_20260306_172101 Paper: 2311.16339v1	Benchmark: Granular Event-Based Reward Shaping in RL README.md Benchmark: Granular Event-Based Reward Shaping in RL Overview This benchmark evaluates the impact of Granular, Event-Based Reward Shaping on Reinforcement Learning training efficiency. It contrasts a standard "Sparse Reward" s...	03-29 08:01	Success	-	View
exp_2312.16582v1_20260307_160847 Paper: 2312.16582v1	Here is the design for the Backfill Candidate 2312.16582v1 (Learnable Chamfer Distance) benchmark. This benchmark compares a standard Point Cloud Autoencoder using static Chamfer Distance against the same architecture augmented with the proposed Learnable Chamfer Distance (LCD) module. --- README.md Benchmark: Learnable Chamfer Dista...	03-29 08:01	Success	-	View
exp_2312.16600v1_20260307_161509 Paper: 2312.16600v1	Benchmark: CICL Architecture (Backfill Candidate 2312.16600v1) README.md Benchmark: CICL Architecture (Backfill Candidate 2312.16600v1) This benchmark evaluates the memory footprint and inference throughput of the Contrastive Instance-Consistent Learning (CICL) architecture applied to single-cell R...	03-29 08:01	Success	-	View
exp_2312.16610v1_20260307_124601 Paper: 2312.16610v1	Benchmark: Efficient MoFME vs. Standard MoE README.md Benchmark: Efficient MoFME vs. Standard MoE This benchmark evaluates the Efficient Deweather Mixture-of-Experts (MoFME) architecture against a Standard Mixture-of-Experts (MoE) baseline. The innovation in MoFME lies in rep...	03-29 08:01	Success	-	View
exp_2312.16623v1_20260307_104426 Paper: 2312.16623v1	This benchmark evaluates the memory footprint and inference latency of the architecture described in arXiv:2312.16623v1. README.md This benchmark evaluates the memory footprint and inference latency of the architecture described in arXiv:2312.16623v1. Innovation Summary: The paper proposes a BERT-based enhancement for Chinese Spelling Check (CSC) featurin...	03-29 08:01	Success	-	View
exp_2312.16627v1_20260307_124218 Paper: 2312.16627v1	Here is the runnable benchmark for MIM4DD. No summary available yet.	03-29 08:01	Success	-	View
exp_2312.16649v1_20260307_110434 Paper: 2312.16649v1	FatFormer (Backfill 2312.16649v1) Benchmark README.md FatFormer (Backfill 2312.16649v1) Benchmark This benchmark evaluates the FatFormer architecture, focusing on its efficiency in memory usage and throughput when employing "Forgery-aware Adapters" and frequency domain analysis o...	03-29 08:01	Success	-	View
exp_2312.16682v2_20260307_105556 Paper: 2312.16682v2	Benchmark for Backfill Candidate 2312.16682v2 README.md Benchmark for Backfill Candidate 2312.16682v2 Soft Margin Extension of the Binary Cringe Loss This benchmark is designed to verify the core claims of the proposed training objective: 1. Zero Inference Overhead: The method...	03-29 08:01	Success	-	View
exp_2312.16702v1_20260307_160805 Paper: 2312.16702v1	```markdown bash pip install torch transformers bash python benchmark.py	03-29 08:01	Success	-	View
exp_2312.16707v1_20260307_105644 Paper: 2312.16707v1	Section 1: README.md No summary available yet.	03-29 08:01	Success	-	View
exp_2312.16730v1_20260307_095021 Paper: 2312.16730v1	Benchmark: Theoretical RL & Bandit Function Approximation README.md Benchmark: Theoretical RL & Bandit Function Approximation This benchmark evaluates the fundamental concepts described in Backfill Candidate 2312.16730v1. Since the innovation is a theoretical survey of reinforcement learning a...	03-29 08:01	Success	-	View
exp_2312.16733v1_20260307_113021 Paper: 2312.16733v1	SuperServe Benchmark: SubNetAct & SlackFit README.md SuperServe Benchmark: SubNetAct & SlackFit This benchmark evaluates the SuperServe architecture, specifically the SubNetAct mechanism and SlackFit scheduling policy, as described in the research on "Fine-Grained Infere...	03-29 08:01	Success	-	View
exp_2312.17278v2_20260307_105019 Paper: 2312.17278v2	Based on the provided abstract, the "TAISR framework" is a methodological guide for applying existing LLMs to research w... We will benchmark the inference speed and VRAM usage of a standard model (`gpt2`) executing a "TAISR-style" complex prompting workflow (which involves context and role-playing) compared to a standard direct query. --- FILE_BREAK--- bash pip...	03-29 08:01	Success	-	View
exp_2312.17279v3_20260307_124533 Paper: 2312.17279v3	Here is the runnable benchmark for the Stateful Conformer with Cache-based Inference. This benchmark compares a Standard Buffered Conformer (Baseline) against the proposed Stateful Conformer with Cache (Innovation). It simulates a streaming scenario where audio is processed in chunks, highlighting the memory efficien...	03-29 08:01	Success	-	View
exp_2401.08664v3_20260307_095423 Paper: 2401.08664v3	This repository contains the benchmarking suite for Backfill Candidate 2401.08664v3. README.md This repository contains the benchmarking suite for Backfill Candidate 2401.08664v3. Context: As the associated document is a literature survey on Large Language Model (LLM) capabilities in education rather than a specific...	03-29 08:01	Success	-	View
exp_2401.15203v1_20260306_172724 Paper: 2401.15203v1	Benchmark: FedGT (Federated Graph Transformer) - Hybrid Attention Scheme README.md Benchmark: FedGT (Federated Graph Transformer) - Hybrid Attention Scheme This repository contains a minimal, self-contained benchmark to evaluate the performance characteristics of the FedGT (Federated Graph Transformer) archi...	03-29 08:01	Success	-	View
exp_2401.15236v2_20260306_180611 Paper: 2401.15236v2	Dual-Norse Adaptive Inference Benchmark README.md Dual-Norse Adaptive Inference Benchmark This benchmark simulates the "Dual-Norse" dynamic model-swapping architecture (Innovation 2401.15236v2). It demonstrates a hardware-constrained inference scenario (such as a nano-drone)...	03-29 08:01	Success	-	View
exp_2401.15238v1_20260306_173527 Paper: 2401.15238v1	Benchmark: Self-Supervised TabTransformer with Specialized Encoders README.md Benchmark: Self-Supervised TabTransformer with Specialized Encoders This benchmark evaluates the performance of a Self-Supervised TabTransformer implementing the specialized input encoding strategies (Binned-TT and MLP-based-T...	03-29 08:01	Success	-	View
exp_2402.16194v1_20260306_171132 Paper: 2402.16194v1	ASEM Architecture Benchmark README.md ASEM Architecture Benchmark This benchmark evaluates the performance characteristics of the ASEM (Emotion Analysis on top of Sentiment Analysis) architecture, specifically focusing on the **Mixture of Experts (Multiple Encoder...	03-29 08:01	Success	-	View
exp_2403.18128v1_20260306_172338 Paper: 2403.18128v1	This benchmark evaluates the performance characteristics of the HealthGAT architecture against a standard Transforme... Architecture: HealthGAT utilizes a hierarchical Graph Attention Network (GAT) architecture. It transforms raw Electronic Health Records (EHR) into a graph structure, employing iterative refinement layers to update medical code embedding...	03-29 08:01	Success	-	View
exp_2403.18159v2_20260306_173809 Paper: 2403.18159v2	Here is the runnable benchmark for the `ov-freeze` innovation. Architecture: Introduces ov-freeze, a lightweight Quantization-Aware Knowledge Distillation (KD-QAT) technique. It stabilizes the training of 4-bit weight quantized LLMs by addressing gradient propagation vulnerabilities identified...	03-29 08:01	Success	-	View
exp_2405.16312v2_20260306_180544 Paper: 2405.16312v2	```markdown README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_2405.16339v2_20260306_174141 Paper: 2405.16339v2	Section 1: README.md bash pip install torch python benchmark.py	03-29 08:01	Success	-	View
exp_2405.16363v2_20260306_172814 Paper: 2405.16363v2	Benchmarking Hierarchical Cluster-Constrained Control System (2405.16363v2) README.md Benchmarking Hierarchical Cluster-Constrained Control System (2405.16363v2) This repository contains a runnable, self-contained benchmark for the Hierarchical, Cluster-Constrained Control System architecture proposed in Backfi...	03-29 08:01	Success	-	View
exp_2406.17086v1_20260307_160651 Paper: 2406.17086v1	BrainMAE Efficiency Benchmark README.md BrainMAE Efficiency Benchmark This benchmark evaluates the architectural efficiency of the proposed BrainMAE model (Candidate 2406.17086v1). Innovation Summary BrainMAE proposes using a Masked Autoencoder (MAE) with a Graph At...	03-29 08:01	Success	-	View
exp_2406.17095v1_20260307_094135 Paper: 2406.17095v1	Backfill Candidate 2406.17095v1: Attention Directive Benchmark README.md Backfill Candidate 2406.17095v1: Attention Directive Benchmark Overview This benchmark evaluates the performance impact of Candidate 2406.17095v1, a non-invasive prompting technique designed to mitigate the "Lost-in-the-Middle...	03-29 08:01	Success	-	View
exp_2406.17115v3_20260307_161424 Paper: 2406.17115v3	HQH & HQM Benchmark Suite README.md HQH & HQM Benchmark Suite This repository contains a runnable benchmark for the HQH (Hallucination Questionnaire for Heterogeneity) dataset and the HQM (Hallucination Quality Metric) evaluation framework, as proposed in th...	03-29 08:01	Success	-	View
exp_2406.17119v2_20260307_154532 Paper: 2406.17119v2	Benchmark: U-AFNO (U-Net + Adaptive Fourier Neural Operator) README.md Benchmark: U-AFNO (U-Net + Adaptive Fourier Neural Operator) Candidate: 2406.17119v2 Innovation: Hybrid U-AFNO Architecture Abstract: This benchmark evaluates a hybrid architecture combining a U-Net backbone with a Vis...	03-29 08:01	Success	-	View
exp_2406.17126v2_20260307_085517 Paper: 2406.17126v2	```markdown README.md	03-29 08:01	Success	-	View
exp_2406.17148v2_20260307_084131 Paper: 2406.17148v2	MixTex Architecture Benchmark README.md MixTex Architecture Benchmark This benchmark evaluates the MixTex architecture as described in "Backfill Candidate 2406.17148v2". Architecture Overview: MixTex proposes a dual-transformer approach combining a **Swin Transf...	03-29 08:01	Success	-	View
exp_2406.17150v1_20260307_081622 Paper: 2406.17150v1	Here is the design for a runnable benchmark validating the sparse activation efficiency claims of Backfill Candidate 240... No summary available yet.	03-29 08:01	Success	-	View
exp_2406.17158v1_20260306_173925 Paper: 2406.17158v1	This benchmark is designed to evaluate the DEXTER innovation claims: specifically, the performance gap between stand... README.md This benchmark is designed to evaluate the DEXTER innovation claims: specifically, the performance gap between standard Dense Retrievers and Hybrid/Lexical approaches (like BM25 or Late Interaction) on complex, multi-hop Quest...	03-29 08:01	Success	-	View
exp_2406.17167v1_20260306_180518 Paper: 2406.17167v1	Section 1: README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_2406.17167v1_20260307_112833 Paper: 2406.17167v1	Benchmark: Low-Rank & Sparse Properties of One-Layer Transformers README.md Benchmark: Low-Rank & Sparse Properties of One-Layer Transformers This benchmark validates the theoretical findings from "Learning on Transformers is Provable Low-Rank and Sparse: A One-layer Analysis." Theory Verification The pap...	03-29 08:01	Success	-	View
exp_2406.17168v1_20260307_160612 Paper: 2406.17168v1	Benchmark for Backfill Candidate 2406.17168v1 README.md Benchmark for Backfill Candidate 2406.17168v1 This benchmark evaluates the concurrent multi-task reinforcement learning with distillation architecture described in the paper "Backfill Candidate 2406.17168v1". Innovation Overvi...	03-29 08:01	Success	-	View
exp_2406.17184v2_20260306_172607 Paper: 2406.17184v2	Benchmark: Bias-Canceling UCB & Discretized Partitioning (Candidate 2406.17184v2) README.md Benchmark: Bias-Canceling UCB & Discretized Partitioning (Candidate 2406.17184v2) This benchmark evaluates the architectural innovation proposed in 2406.17184v2, which introduces a **Bias-Canceling Upper Confidence Bound (BC-UCB...	03-29 08:01	Success	-	View
exp_2406.17185v1_20260306_170040 Paper: 2406.17185v1	Section 1: README.md bash pip install numpy psutil bash python benchmark.py	03-29 08:01	Success	-	View
exp_2406.17185v1_20260307_081228 Paper: 2406.17185v1	Vaporetto Algorithm Simulation Benchmark README.md Vaporetto Algorithm Simulation Benchmark This benchmark demonstrates the performance characteristics of Vaporetto (Efficient Japanese Tokenization) using a pure Python simulation. Overview of the Innovation Vaporetto optimizes...	03-29 08:01	Success	-	View
exp_2406.17186v2_20260307_095516 Paper: 2406.17186v2	Benchmark: CLERC RAG Pipeline (Local Inference) README.md Benchmark: CLERC RAG Pipeline (Local Inference) This benchmark evaluates the Local Inference capabilities of the CLERC architecture, specifically testing the viability of replacing the cloud-based GPT-4o generator with a quant...	03-29 08:01	Success	-	View
exp_2407.09527v1_20260307_105513 Paper: 2407.09527v1	Benchmark: Median-Based 1.58-bit Quantization (Candidate 2407.09527v1) README.md Benchmark: Median-Based 1.58-bit Quantization (Candidate 2407.09527v1) Overview This benchmark validates the efficiency claims of the proposed "median-based" BitNet b1.58 variant. Specifically, it tests the hypothesis that a 1.58-...	03-29 08:01	Success	-	View
exp_2407.17642v1_20260306_171107 Paper: 2407.17642v1	SMA-Hyper Framework Benchmark README.md SMA-Hyper Framework Benchmark Innovation: Dynamic Dual Adaptive Spatiotemporal Learning with Hypergraphs Domain: Urban Risk Prediction (Spatiotemporal Forecasting) This benchmark evaluates the SMA-Hyper architecture, w...	03-29 08:01	Success	-	View
exp_2407.17671v2_20260306_173342 Paper: 2407.17671v2	Benchmark: UDI vs. Global Distillation README.md Benchmark: UDI vs. Global Distillation This benchmark evaluates the computational cost of the UDI (Unsqueezed Distillation-based SSL) architecture compared to a standard global compression baseline. Context Standard SSL method...	03-29 08:01	Success	-	View
exp_2407.20266v1_20260306_170503 Paper: 2407.20266v1	Section 1: README.md bash pip install torch bash python benchmark.py	03-29 08:01	Success	-	View
exp_2408.13352v1_20260306_174940 Paper: 2408.13352v1	Here is the design for the QAdaPrune benchmark. This benchmark simulates a Variational Quantum Circuit (VQC) training sc... README.md QAdaPrune: Adaptive Parameter Pruning Benchmark This benchmark evaluates the efficiency of QAdaPrune, an adaptive, hyperparameter-free pruning method for Variational Quantum Circuits (VQCs). Innovation Overview Standard VQCs s...	03-29 08:01	Success	-	View
exp_2409.05872v1_20260306_174451 Paper: 2409.05872v1	Here is the runnable benchmark design for the CSRec (Causal Sequential Recommendation) innovation. No summary available yet.	03-29 08:01	Success	-	View
exp_2410.17477v6_20260306_173236 Paper: 2410.17477v6	Architectural Benchmark: Transformer vs. Recurrent (RWKV) README.md Architectural Benchmark: Transformer vs. Recurrent (RWKV) Overview This benchmark validates the claims of Backfill Candidate 2410.17477v6, specifically the shift from self-attention (Transformer) to Recurrent Architectures (RW...	03-29 08:01	Success	-	View
exp_2410.19859v1_20260306_171635 Paper: 2410.19859v1	Benchmark for Backfill Candidate 2410.19859v1 README.md Benchmark for Backfill Candidate 2410.19859v1 Hierarchical Beam Selection (MMT + RL) This benchmark evaluates the performance of the proposed Hierarchical Two-Stage Beam Selection Framework. The system decouples the selection...	03-29 08:01	Success	-	View
exp_2411.14585v3_20260306_172149 Paper: 2411.14585v3	PointLCA-Net Benchmark README.md PointLCA-Net Benchmark This benchmark evaluates the PointLCA-Net architecture, a hybrid spatio-temporal processing system designed for edge neuromorphic computing. It combines the spatial feature extraction capabilities of **P...	03-29 08:01	Success	-	View
exp_2412.16715v1_20260307_083845 Paper: 2412.16715v1	Benchmarking CCFormer for WSI Analysis README.md Benchmarking CCFormer for WSI Analysis This benchmark evaluates the performance characteristics of CCFormer, an architecture designed to process Whole Slide Images (WSIs) as sparse point clouds of cells. Objective The primary...	03-29 08:01	Success	-	View
exp_2412.16738v1_20260307_102802 Paper: 2412.16738v1	KKAN (Kolmogorov-Arnold Network) Hybrid Benchmark README.md KKAN (Kolmogorov-Arnold Network) Hybrid Benchmark This benchmark evaluates the KKAN (Kolmogorov-Arnold Network) architecture, a hybrid design combining MLP-based inner functions with learnable outer basis functions. This struc...	03-29 08:01	Success	-	View
exp_2412.16739v2_20260307_113900 Paper: 2412.16739v2	```markdown bash python benchmark.py ``` Expected Output The benchmark will report VRAM usage and Throughput (Tokens/Samples per second) for both modes, demonstrating the speed and efficiency gains of the unrolled architecture.	03-29 08:01	Success	-	View
exp_2412.16745v2_20260307_103336 Paper: 2412.16745v2	Visual Mamba (ViM) Benchmark: Candidate 2412.16745v2 README.md Visual Mamba (ViM) Benchmark: Candidate 2412.16745v2 Overview This benchmark suite is designed to verify the performance claims of the Visual Mamba (ViM) architecture (arXiv 2412.16745v2). Specifically, it targets the innovati...	03-29 08:01	Success	-	View
exp_2412.16746v4_20260307_155146 Paper: 2412.16746v4	```markdown No summary available yet.	03-29 08:01	Success	-	View
exp_2412.16763v1_20260306_172630 Paper: 2412.16763v1	Benchmark Design for Paraformer (ClimSim Innovation) Innovation Summary Paraformer introduces a Transformer-based architecture to replace classical CNN/RNN methods in global climate model parameterization. It utilizes a "memory-aware" design to handle the large-scale ClimSim d...	03-29 08:01	Success	-	View
exp_2412.16763v1_20260307_112925 Paper: 2412.16763v1	Paraformer Benchmark: Climate Parameterization README.md Paraformer Benchmark: Climate Parameterization This benchmark evaluates the performance characteristics of Paraformer, a "memory-aware" Transformer model designed for climate parameterization using the ClimSim dataset. Overvie...	03-29 08:01	Success	-	View
exp_2412.16777v1_20260307_113938 Paper: 2412.16777v1	HyperCLIP Benchmark (Candidate 2412.16777v1) README.md HyperCLIP Benchmark (Candidate 2412.16777v1) This benchmark evaluates the HyperCLIP architecture, which replaces large static vision encoders with a text-conditioned hypernetwork. The goal is to validate the claim that this ar...	03-29 08:01	Success	-	View
exp_2412.16778v2_20260307_161356 Paper: 2412.16778v2	Benchmark: Candidate 2412.16778v2 (RoomPainter MVIS) README.md Benchmark: Candidate 2412.16778v2 (RoomPainter MVIS) Overview This benchmark evaluates the computational overhead and memory footprint associated with Candidate 2412.16778v2, specifically focusing on the **Attention-Guided Mul...	03-29 08:01	Success	-	View
exp_2412.16806v1_20260307_081941 Paper: 2412.16806v1	Benchmark for Quantum Contextuality Analysis in BERT README.md Benchmark for Quantum Contextuality Analysis in BERT This benchmark evaluates the computational overhead of applying Sheaf and Contextuality-by-Default (CbD) theoretical frameworks to standard BERT models, as proposed in Backf...	03-29 08:01	Success	-	View
exp_2412.18633v1_20260307_161325 Paper: 2412.18633v1	Benchmark: BoostMD Surrogate Acceleration README.md Benchmark: BoostMD Surrogate Acceleration This benchmark validates the BoostMD architecture proposal (Candidate 2412.18633v1), which focuses on accelerating Molecular Dynamics (MD) inference by minimizing atomic feature recalc...	03-29 08:01	Success	-	View
exp_2501.11733v2_20260306_180805 Paper: 2501.11733v2	Benchmark: Mobile-Agent-E Hierarchical Performance README.md Benchmark: Mobile-Agent-E Hierarchical Performance This benchmark evaluates the architectural efficiency of Mobile-Agent-E, a hierarchical multi-agent framework with persistent memory, against a standard flat agent architectur...	03-29 08:01	Success	-	View
exp_2501.11779v2_20260306_163150 Paper: 2501.11779v2	```markdown text pip install torch bash python benchmark.py ```	03-29 08:01	Success	-	View
exp_2502.15709v2_20260306_173159 Paper: 2502.15709v2	2502.15709v2 No summary available yet.	03-29 08:01	Success	-	View
exp_2504.14772v2_20260306_173107 Paper: 2504.14772v2	Here is the design for the benchmark. No summary available yet.	03-29 08:01	Success	-	View
exp_2505.14959v1_20260306_170011 Paper: 2505.14959v1	--- README.md Benchmark: Privacy-Preserving Collaborative CVR Training This benchmark evaluates the Privacy-Preserving Collaborative Training Framework for Conversion Rate (CVR) prediction. Innovation Overview The paper proposes a dual-laye...	03-29 08:01	Failed	RuntimeError: Expected all tensors to be on the same device, but got mat1 is on cuda:0, different from other tensors on cpu (when checking argument in method wrapper_CUDA_addmm)	View
exp_2505.14969v2_20260307_161029 Paper: 2505.14969v2	Here is the runnable benchmark designed for the innovation described in Backfill Candidate 2505.14969v2. README.md bash pip install torch python benchmark.py	03-29 08:01	Success	-	View
exp_2505.14970v4_20260306_173025 Paper: 2505.14970v4	Here is the runnable benchmark for the Self-Evolving Curriculum (SEC) innovation. README.md	03-29 08:01	Success	-	View
exp_2505.14972v2_20260306_171837 Paper: 2505.14972v2	Benchmark: CROSS Cultural Safety Alignment Framework README.md Benchmark: CROSS Cultural Safety Alignment Framework This repository contains a minimal, reproducible benchmark designed to evaluate the CROSS Cultural Safety Alignment Framework (Innovation Candidate 2505.14972v2). Overview o...	03-29 08:01	Success	-	View
exp_2505.14975v3_20260306_174819 Paper: 2505.14975v3	This benchmark evaluates the architectural efficiency of Backfill Candidate 2505.14975v3 ("Flat Policy via Bootstrap... README.md This benchmark evaluates the architectural efficiency of Backfill Candidate 2505.14975v3 ("Flat Policy via Bootstrapping"). The Innovation: Standard Hierarchical Reinforcement Learning (HRL) relies on a "Manager" (High-Lev...	03-29 08:01	Success	-	View
exp_2506.16552v3_20260307_100154 Paper: 2506.16552v3	Benchmark: Revela-Style Dense Retriever Learning Architecture: Revela employs a standard dense dual-encoder architecture (Bi-Encoder). It integrates retriever optimization into Language Modeling (LM) training by using retriever-computed similarity scores to weight an in-batch cross-do...	03-29 08:01	Success	-	View
exp_2506.16571v2_20260307_161251 Paper: 2506.16571v2	Here is the benchmark design for the "Backfill Candidate 2506.16571v2" innovation, focusing on the feasibility of proces... Paper Analysis: Capturing Visualization Design Rationale This paper introduces a methodology and dataset for extracting visualization design rationales from student notebooks, creating a corpus of Question-Answer-Rationale triples usi...	03-29 08:01	Success	-	View
exp_2506.16575v1_20260307_154445 Paper: 2506.16575v1	Benchmark: Elo-Based Multi-Candidate Aggregation (ARES) Paper Summary: Elo Rating System for Harmful Content Detection Architecture: The paper proposes an inference workflow utilizing an Elo rating system to rank and select optimal LLM responses for detecting harmful content (microaggres...	03-29 08:01	Success	-	View
exp_2506.16580v1_20260307_100237 Paper: 2506.16580v1	Here is the design for the runnable benchmark targeting the Emformer + NAR architecture candidate. Architecture: Replaces standard encoder blocks with an Emformer (Efficient Memory Transformer) to enable chunk-based attention and streamable processing. The model utilizes a non-autoregressive decoder to parallelize output generati...	03-29 08:01	Success	-	View
exp_2506.16584v1_20260307_083216 Paper: 2506.16584v1	Benchmark for Variance Decomposition (Semantic Grounding) Architecture & Methodology This paper does not propose a new model architecture. Instead, it introduces a Variance Decomposition Framework, an evaluation methodology designed to measure semantic grounding. It assesses whether an LLM...	03-29 08:01	Success	-	View
exp_2506.16586v1_20260307_100312 Paper: 2506.16586v1	Benchmark: Agentic QA Workflow Efficiency (Backfill 2506.16586v1) Assessment: This paper evaluates a workflow rather than a specific model architecture. It focuses on applying generic "state-of-the-art" LLMs to QA tasks. * Architecture: Utilizes AI-agents for automated test case generation, stat...	03-29 08:01	Success	-	View
exp_2506.16592v1_20260307_090447 Paper: 2506.16592v1	Backfill Candidate 2506.16592v1: Architecture Benchmark Architecture: Utilizes a hybrid design coupling a pre-trained DenseNet121 encoder with a multi-branch attention-enhanced decoder. The bottleneck employs Global Spatial Attention (GSA), Position Encoding, and Scaled Dot-Product Attention...	03-29 08:01	Success	-	View
exp_2506.16593v1_20260307_113618 Paper: 2506.16593v1	Benchmark: Slip-Steer Kinematics & DRIVE Protocol (Candidate 2506.16593v1) Summary for ARES 8GB Roadmap Focus: Physical System Identification & Uncertainty Quantification (Classical/Model-based, not Deep Learning). * Architecture: Proposes a lightweight mathematical "transfer function" linking velocity...	03-29 08:01	Success	-	View
exp_2506.16594v2_20260307_113657 Paper: 2506.16594v2	Benchmark: LLM Biomedical Synthetic Data Generation (Scoping Review 2506.16594v2) This paper is a scoping review, not a technical architecture proposal. Consequently, it provides no specific data regarding model architecture, memory footprint, or inference speed required for the ARES 8GB roadmap. * **Architecture...	03-29 08:01	Success	-	View
exp_2506.16596v3_20260307_104214 Paper: 2506.16596v3	```markdown This paper outlines a community-driven vision for a modern Cyc-like knowledge infrastructure to address LLM hallucinations and reasoning gaps. * Architecture: Proposes an "open engineering framework" integrating modular Knowledge Repres...	03-29 08:01	Success	-	View
exp_2506.16597v1_20260307_113747 Paper: 2506.16597v1	Benchmark: Vision Transformer (ViT) on Recurrence Plots for Exoplanet Classification Paper: Exoplanet Classification through Vision Transformers with Temporal Image Analysis Architecture: The proposed pipeline converts 1D Kepler light curves into 2D Recurrence Plots (RPs) or Gramian Angular Fields (GAFs) to serve as...	03-29 08:01	Success	-	View
exp_2506.16600v2_20260306_165933 Paper: 2506.16600v2	FLAME Architecture Benchmark: Dynamic Sparse Activation FLAME proposes a Sparse Mixture-of-Experts (SMoE) framework for federated LLM fine-tuning, designed to eliminate the performance degradation caused by compressing LoRA matrices on low-resource clients. * Architecture: Replaces stand...	03-29 08:01	Success	-	View
exp_2506.16600v2_20260307_080909 Paper: 2506.16600v2	Here is the design for the FLAME benchmark, focusing on the core innovation: enabling resource-adaptive federated learni... FLAME proposes a Sparse Mixture-of-Experts (SMoE) framework for federated LLM fine-tuning, designed to eliminate the performance degradation caused by compressing LoRA matrices on low-resource clients. * Architecture: Replaces stand...	03-29 08:01	Success	-	View
exp_2506.16617v1_20260306_174407 Paper: 2506.16617v1	thoughts: 1. Analyze the Request: * Input: Title "Backfill Candidate 2506.16617v1", Abstract about "Human-Centric Evaluation Framework" for XAI in PPM. * Constraints: Output `README.md` and `benchmark.py` separated by `	03-29 08:01	Success	-	View
exp_2506.16623v1_20260307_100441 Paper: 2506.16623v1	Section 1: README.md Architecture The framework utilizes a frontier-based exploration strategy guided by a Vision-Language Model (VLM). Instead of simple embedding similarity, it employs dynamic history-augmented prompting. The system injects a text...	03-29 08:01	Success	-	View
exp_2506.16628v1_20260307_105849 Paper: 2506.16628v1	--- Architecture: Hybrid offline design. LLMs are utilized exclusively during the development phase to generate rules, identify relevant text snippets, and extract keywords. The production system is a traditional rule-based NLP pipeline (Re...	03-29 08:01	Success	-	View
exp_2506.16633v2_20260307_083805 Paper: 2506.16633v2	Section 1: README.md Paper: GeoGuess (SightSense) Summary for ARES 8GB Roadmap: * Architecture: Proposes SightSense, a multimodal framework processing Street View panoramas. It employs a hierarchical visual encoder to synthesize local de...	03-29 08:01	Success	-	View
exp_2506.16636v1_20260307_154250 Paper: 2506.16636v1	README.md Architecture The method relies on Masked Autoregressive Flows (MAF). Rather than standard generative sampling, it proposes a "Latent Noise Injection" (LNI) technique: encoding specific observed data points into the latent space, app...	03-29 08:01	Success	-	View
exp_2506.16640v4_20260306_170755 Paper: 2506.16640v4	Here is the runnable benchmark code for the Adaptive-Scalable Entmax (ASEntmax) innovation. Architecture Proposes Adaptive-Scalable Entmax (ASEntmax), a drop-in replacement for Softmax attention. It utilizes $\alpha$-entmax to assign exact zeros to irrelevant tokens, creating dynamically sparse attention maps. A learnable...	03-29 08:01	Success	-	View
exp_2506.16640v4_20260307_080835 Paper: 2506.16640v4	Section 1: README.md Architecture Proposes Adaptive-Scalable Entmax (ASEntmax), a drop-in replacement for Softmax attention. It utilizes $\alpha$-entmax to assign exact zeros to irrelevant tokens, creating dynamically sparse attention maps. A learnable...	03-29 08:01	Success	-	View
exp_2506.16644v1_20260307_100709 Paper: 2506.16644v1	SORE Architecture Benchmark Architecture SORE replaces autoregressive LLMs with a dual-stage pipeline utilizing multilingual sentence encoders and Approximate Nearest Neighbor (ANN) search. It identifies core content via metadata embeddings and filters extraneous...	03-29 08:01	Success	-	View
exp_2506.16650v1_20260306_174212 Paper: 2506.16650v1	SemAgent: Semantic-Driven Two-Stage Benchmark Architecture: Proposes a complex, multi-stage agentic workflow. It moves beyond simple code localization by integrating execution semantics for context retrieval and generalized abstraction for issue understanding. The core uses...	03-29 08:01	Success	-	View
exp_2506.16650v1_20260307_102357 Paper: 2506.16650v1	SemAgent Pipeline Benchmark (8GB Constraint) Architecture: Proposes a complex, multi-stage agentic workflow. It moves beyond simple code localization by integrating execution semantics for context retrieval and generalized abstraction for issue understanding. The core uses...	03-29 08:01	Success	-	View
exp_2506.16655v1_20260307_102631 Paper: 2506.16655v1	Backfill Candidate 2506.16655v1 Benchmark Architecture Arch-Router is a compact 1.5B parameter model functioning as a classifier. Instead of generating text, it maps user queries to specific domains (e.g., travel) or action types to select the most appropriate downstream model...	03-29 08:01	Success	-	View
exp_2507.14722v1_20260306_155934 Paper: 2507.14722v1	This benchmark simulates the core innovation of the LeanTree methodology for Automated Theorem Proving (ATP) as desc... README.md This benchmark simulates the core innovation of the LeanTree methodology for Automated Theorem Proving (ATP) as described in the analysis. The Innovation: LeanTree proposes a "White-Box" approach that factorizes complex pr...	03-29 08:01	Success	-	View
exp_2507.14757v1_20260306_180835 Paper: 2507.14757v1	Section 1: README.md Operational Manifold SNN Benchmark Overview This benchmark validates the "Operational Manifold" design principle for Spiking Neural Networks (SNNs). It demonstrates that SNN performance (measured here as network viability and spike thro...	03-29 08:01	Success	-	View
exp_2507.14758v1_20260306_165905 Paper: 2507.14758v1	GRACE Framework Benchmark This benchmark evaluates the GRACE (Generative Recommendation via Chain-of-Thought) framework concepts. What is being tested? 1. Hybrid CoT Tokenization: Instead of predicting the next Item ID directly, the model interprets and gene...	03-29 08:01	Failed	TypeError: int is not a Module subclass	View
exp_2507.14766v1_20260306_172220 Paper: 2507.14766v1	Design Reasoning To benchmark the innovation described (CXR-TFT), we need to simulate the computational cost of the Multi-Modal Temporal Fusion architecture. Core Architecture to Simulate: 1. Sparse-Dense Alignment: The unique computational load...	03-29 08:01	Success	-	View
exp_2507.14768v2_20260306_172920 Paper: 2507.14768v2	Benchmark: Heterogeneous Hierarchical Secure Aggregation (H-HSA) README.md Benchmark: Heterogeneous Hierarchical Secure Aggregation (H-HSA) This benchmark evaluates the computational and memory efficiency gains of the Heterogeneous Hierarchical Secure Aggregation (H-HSA) innovation compared to standa...	03-29 08:01	Success	-	View
exp_2508.06495v1_20260306_155746 Paper: 2508.06495v1	Benchmark: Local Fact-Checking Data Pipeline (Backfill Candidate 2508.06495v1) README.md Benchmark: Local Fact-Checking Data Pipeline (Backfill Candidate 2508.06495v1) Overview This benchmark evaluates the feasibility of running the "Claim Extraction" phase of the Portuguese fact-checking pipeline (as described in...	03-29 08:01	Success	-	View
exp_2508.13337v1_20260306_163845 Paper: 2508.13337v1	Benchmark: X-MoE Inspired Padding-Free & Sparse Execution README.md Benchmark: X-MoE Inspired Padding-Free & Sparse Execution Overview This benchmark evaluates the memory efficiency gains derived from the X-MoE (Padding-Free Execution) and Redundancy-Bypassing Dispatch principles, specific...	03-29 08:01	Success	-	View
exp_2508.13346v1_20260306_155642 Paper: 2508.13346v1	Backfill Candidate 2508.13346v1: Barron Bounds & Linear Efficiency README.md Backfill Candidate 2508.13346v1: Barron Bounds & Linear Efficiency Overview This benchmark validates the theoretical limits of function approximation for linear methods on hardware with constrained VRAM (RTX A2000 8GB target). Bas...	03-29 08:01	Failed	RuntimeError: The size of tensor a (128) must match the size of tensor b (8) at non-singleton dimension 1	View
exp_2508.13358v1_20260306_153101 Paper: 2508.13358v1	Benchmark: Dynamic Beam Pruning for LLM Text Generation README.md Benchmark: Dynamic Beam Pruning for LLM Text Generation 1. Context & Relevance This benchmark evaluates the transfer of "Aggressive Beam Search Pruning" logic (originally proposed for ASR/MT streaming in the candidate paper) t...	03-29 08:01	Success	-	View
exp_2508.13364v1_20260306_162901 Paper: 2508.13364v1	HAL 9000 Risk Prediction Benchmark README.md HAL 9000 Risk Prediction Benchmark This benchmark evaluates the machine learning component of the HAL 9000 system, simulating the processing of scraped vulnerability data to predict exploitability. Context: The underlying...	03-29 08:01	Success	-	View
exp_2508.13376v1_20260306_171223 Paper: 2508.13376v1	Innovation: Semantic-Enhanced ASR via LLaMA Distillation README.md Innovation: Semantic-Enhanced ASR via LLaMA Distillation Candidate: Backfill 2508.13376v1 Target Hardware: RTX A2000 (8GB VRAM) Focus: Cross-modal Distillation & Memory Efficiency Summary This benchmark validates the f...	03-29 08:01	Success	-	View
exp_2508.13380v1_20260306_155459 Paper: 2508.13380v1	J3O: Joint Optimization of Onloading & Offloading Benchmark README.md J3O: Joint Optimization of Onloading & Offloading Benchmark This benchmark validates the J3O (Joint Optimization) innovation for constrained hardware environments (Target: <8GB VRAM, e.g., RTX A2000). The Innovation Traditiona...	03-29 08:01	Failed	RuntimeError: Expected all tensors to be on the same device, but got weight is on cpu, different from other tensors on cuda:0 (when checking argument in method wrapper_CUDA__native	View
exp_2508.14125v1_20260306_133835 Paper: 2508.14125v1	Benchmark: Smart Parking Prediction (Sensor-Free Framework) README.md Benchmark: Smart Parking Prediction (Sensor-Free Framework) Overview This benchmark evaluates the Smart Parking Prediction Framework proposed in Backfill Candidate 2508.14125v1. The original paper proposes a "sensor-free" ap...	03-29 08:01	Success	-	View
exp_2508.14125v1_20260306_152000 Paper: 2508.14125v1	Section 1: README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_2508.14125v1_20260306_152219 Paper: 2508.14125v1	```markdown bash pip install torch numpy scikit-learn pandas bash python benchmark.py ``` The script will output performance metrics, VRAM usage, and the final verification of the hypothesis (RFR vs LSTM).	03-29 08:01	Failed	RuntimeError: Found dtype Double but expected Float	View
exp_2508.15831v2_20260306_162734 Paper: 2508.15831v2	2508.15831v2 No summary available yet.	03-29 08:01	Success	-	View
exp_2509.14438v1_20260306_134635 Paper: 2509.14438v1	Benchmark: Bias Mitigation Overhead Analysis (Candidate 2509.14438v1) README.md Benchmark: Bias Mitigation Overhead Analysis (Candidate 2509.14438v1) This benchmark evaluates the computational cost and efficacy of the bias mitigation strategies proposed in the ARES Analysis Log. Objective The source p...	03-29 08:01	Success	-	View
exp_2509.14448v1_20260306_152446 Paper: 2509.14448v1	VCBench: Lightweight Founder Success Prediction (Replica) README.md VCBench: Lightweight Founder Success Prediction (Replica) This benchmark is a runnable, lightweight replication of the VCBench evaluation framework (Target: Backfill Candidate 2509.14448v1). Context: The original VCBench p...	03-29 08:01	Success	-	View
exp_2509.14456v2_20260306_152929 Paper: 2509.14456v2	Benchmark: CORRECT-DETECT Trade-off Analysis (Candidate 2509.14456v2) README.md Benchmark: CORRECT-DETECT Trade-off Analysis (Candidate 2509.14456v2) This benchmark evaluates the CORRECT-DETECT cognitive bottleneck identified in the candidate paper. The Innovation The paper argues that standard LLMs suffe...	03-29 08:01	Success	-	View
exp_2509.14480v1_20260306_163542 Paper: 2509.14480v1	Benchmark: Turn-level Adjudicated Reinforcement Learning (TARL) README.md Benchmark: Turn-level Adjudicated Reinforcement Learning (TARL) This benchmark suite evaluates the computational efficiency and memory footprint of the Turn-level Adjudicated Reinforcement Learning (TARL) protocol proposed in...	03-29 08:01	Failed	RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x32 and 128x1)	View
exp_2509.14488v1_20260306_163044 Paper: 2509.14488v1	Thought Process 1. Analyze the Innovation: The paper "ARES Analysis" proposes replacing global synchronization (expensive, scales linearly $O(m)$) with randomized local coordination (constant time $O(1)$). In the context of an RTX A2000 8GB (Bandwidth...	03-29 08:01	Success	-	View
exp_2509.16256v1_20260306_155721 Paper: 2509.16256v1	Benchmark: HausaMovieReview Innovation (Low-Data Efficiency) README.md Benchmark: HausaMovieReview Innovation (Low-Data Efficiency) Candidate: Backfill Candidate 2509.16256v1 (HausaMovieReview) Verdict: REJECTED FOR CODING (Directive 11) Objective: Verify the paper's claim that Classical...	03-29 08:01	Failed	ZeroDivisionError: float division by zero	View
exp_2509.18178v2_20260306_115349 Paper: 2509.18178v2	Benchmark: Multi-Agent Workflow Orchestration (Foam-Agent Pattern) README.md Benchmark: Multi-Agent Workflow Orchestration (Foam-Agent Pattern) Overview This benchmark validates the architectural efficiency of the multi-agent pattern described in the "Foam-Agent" paper (Backfill Candidate 2509.18178v2)...	03-29 08:01	Success	-	View
exp_2510.16197v1_20260306_162952 Paper: 2510.16197v1	Benchmark: LaSDI-Inference (Latent Space Dynamics Identification) README.md Benchmark: LaSDI-Inference (Latent Space Dynamics Identification) This benchmark evaluates the computational efficiency and memory footprint of the LaSDI (Latent Space Dynamics Identification) framework when applied to hig...	03-29 08:01	Success	-	View
exp_2510.16198v1_20260306_153211 Paper: 2510.16198v1	ARES Protocol Benchmark: EgMM-Corpus & CLIP Evaluation README.md ARES Protocol Benchmark: EgMM-Corpus & CLIP Evaluation Overview This benchmark evaluates the computational requirements and processing throughput of standard CLIP models (specifically `openai/clip-vit-base-patch32`) when subjected...	03-29 08:01	Success	-	View
exp_2510.16208v1_20260306_153937 Paper: 2510.16208v1	Backfill Candidate 2510.16208v1: Nonstationary Bandits with Linear Dynamics Benchmark README.md Backfill Candidate 2510.16208v1: Nonstationary Bandits with Linear Dynamics Benchmark This benchmark evaluates the computational efficiency of the Explore-Then-Commit strategy applied to Linear Dynamical Systems (LDS) as descr...	03-29 08:01	Success	-	View
exp_2510.16232v2_20260306_164034 Paper: 2510.16232v2	Section 1: README.md Benchmark: AffPCL (Affinity-based Personalized Collaborative Learning) on 8GB VRAM Innovation Overview This benchmark validates the ARES Analysis: AffPCL & The 8GB Efficiency Frontier. It translates the theoretical "AffPCL" framework (t...	03-29 08:01	Failed	ModuleNotFoundError: No module named 'peft'	View
exp_2510.16250v1_20260306_162842 Paper: 2510.16250v1	Here is the design for the benchmark based on the provided innovation analysis. README.md Benchmark: 1-Bit Weight Quantization (ARES Candidate 2510.16250v1) Overview This benchmark evaluates the memory and performance efficiency of the 1-Bit Weight Quantization technique applied to Random Features/MLP architectures...	03-29 08:01	Failed	NotImplementedError: Module [StandardModel] is missing the required "forward" function	View
exp_2510.16252v1_20260306_152359 Paper: 2510.16252v1	WEBSERV Input Efficiency Benchmark README.md WEBSERV Input Efficiency Benchmark This benchmark evaluates the WEBSERV innovation proposal (Backfill Candidate 2510.16252v1). Context & Goal Modern Web Agents face an Input Bottleneck. Raw browser environments inject mass...	03-29 08:01	Failed	GPU_REQUIRED policy blocked benchmark execution.	View
exp_2510.17881v2_20260306_152814 Paper: 2510.17881v2	POPI: Modular Personalization Benchmark README.md POPI: Modular Personalization Benchmark Overview This benchmark evaluates the POPI (Modular Personalization via Preference Inference) innovation, specifically targeting the 8GB VRAM Efficiency Frontier. The core hypothesis...	03-29 08:01	Failed	torch.AcceleratorError: CUDA error: device-side assert triggered	View
exp_2511.12791v3_20260306_134323 Paper: 2511.12791v3	Benchmark: Adaptive Horizon Selection (Backfill Candidate 2511.12791v3) README.md Benchmark: Adaptive Horizon Selection (Backfill Candidate 2511.12791v3) Objective This benchmark validates the "Dynamic Context Pruning" innovation for the RTX A2000 (8GB VRAM) architecture. It tests the hypothesis that we can def...	03-29 08:01	Success	-	View
exp_2511.12797v2_20260306_155856 Paper: 2511.12797v2	Benchmark: Modality-Agnostic Symbolic Reasoning (Evo2 Insight) README.md Benchmark: Modality-Agnostic Symbolic Reasoning (Evo2 Insight) Overview This benchmark validates the hypothesis proposed in Backfill Candidate 2511.12797v2 (Evo2): that **In-Context Learning (ICL) and symbolic reasoning capabi...	03-29 08:01	Success	-	View
exp_2511.12805v1_20260306_155603 Paper: 2511.12805v1	Benchmark: Sign-augmented Structural Intervention Distance (sSID) README.md Benchmark: Sign-augmented Structural Intervention Distance (sSID) Overview This benchmark evaluates the computational performance of the sign-augmented Structural Intervention Distance (sSID) algorithm as described in Backfill...	03-29 08:01	Success	-	View
exp_2511.12808v4_20260306_152739 Paper: 2511.12808v4	Section 1: README.md LTLf Dense Reward Benchmark (Backfill 2511.12808v4) Overview This benchmark evaluates the computational overhead of Quantitative Linear Temporal Logic ($\text{LTL}_f$) for Reward Shaping in Reinforcement Learning. The innovation replace...	03-29 08:01	Success	-	View
exp_2511.12810v1_20260306_162803 Paper: 2511.12810v1	MSRNet-Inspired Efficiency Benchmark README.md MSRNet-Inspired Efficiency Benchmark This benchmark evaluates the memory efficiency claims derived from the MSRNet (Multi-Scale Refinement) analysis, specifically comparing Stacked Architectures against **Recursive Refinement...	03-29 08:01	Success	-	View
exp_2511.12817v2_20260306_134602 Paper: 2511.12817v2	Here is the runnable benchmark for the FAITH (Knowledge Graph Grounded Evaluation) innovation. README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_2511.12827v1_20260306_133748 Paper: 2511.12827v1	ARES Backfill Candidate 2511.12827v1 README.md ARES Backfill Candidate 2511.12827v1 Confidence-Adaptive Bit-Depth Reduction (CABDR) Benchmark Subject: Cross-Domain Innovation Transfer (Malware Defense -> Post-Transformer Inference) Target Hardware: RTX A2000 (8GB VRAM)...	03-29 08:01	Success	-	View
exp_2511.12827v1_20260306_134505 Paper: 2511.12827v1	Here is the runnable benchmark for the Confidence-Adaptive Bit-Depth Reduction (CABDR) innovation. This benchmark simulates the core "ARES" objective: maximizing inference efficiency on hardware-constrained edge devices (simulated here via dynamic precision switching). It compares a static high-precision model against a dynamic model tha...	03-29 08:01	Success	-	View
exp_2511.12836v1_20260306_155429 Paper: 2511.12836v1	Benchmark: DIGing-SGLD (Decentralized Sampling) README.md Benchmark: DIGing-SGLD (Decentralized Sampling) Overview This benchmark implements the DIGing-SGLD algorithm as described in "Backfill Candidate 2511.12836v1". Note: This innovation focuses on *Bayesian Training/Sampling...	03-29 08:01	Success	-	View
exp_2511.12838v1_20260306_163712 Paper: 2511.12838v1	Co-Sparsify Benchmark README.md Co-Sparsify Benchmark This benchmark evaluates the Co-Sparsify topology-aware sparsification technique for Higher-order Graph Neural Networks (HOGNNs). The Innovation Standard HOGNN layers (2-FWL) require cubic complexity $O(N...	03-29 08:01	Success	-	View
exp_2512.14856v2_20260307_125012 Paper: 2512.14856v2	Benchmark: T5Gemma 2 (Encoder-Decoder) Memory & Throughput Architecture: T5Gemma 2 repurposes the decoder-only Gemma 3 into an encoder-decoder architecture via UL2 adaptation, specifically optimized for multimodal and long-context tasks. Memory Footprint: The model prioritizes VRAM effi...	03-29 08:01	Success	-	View
exp_2512.14865v1_20260307_124833 Paper: 2512.14865v1	Audio MultiChallenge Benchmark Paper: Audio MultiChallenge (Benchmark) Architecture & Scope: This paper introduces Audio MultiChallenge, a benchmark for End-to-End (E2E) Spoken Dialogue Systems (SDS) that process raw audio without intermediate transcription....	03-29 08:01	Success	-	View
exp_2512.14870v1_20260307_081406 Paper: 2512.14870v1	Benchmark: ARES Architecture (HERBench Simulation) HERBench introduces a high-complexity VideoQA benchmark requiring the aggregation of at least three temporally separated visual cues. It utilizes a Minimum Required Frame-Set (MRFS) metric averaging 5.5 frames, significantly higher than...	03-29 08:01	Success	-	View
exp_2512.14879v1_20260307_155056 Paper: 2512.14879v1	Here is the runnable benchmark for the Entropy-Reservoir Bregman Projection (ERBP) innovation. Architecture: Proposes Entropy-Reservoir Bregman Projection (ERBP), a theoretical framework for self-referential training. It addresses model collapse via information geometry rather than proposing a new hardware-efficient model archite...	03-29 08:01	Success	-	View
exp_2512.14880v1_20260307_105725 Paper: 2512.14880v1	```markdown Architecture: Introduces "Task Matrices"—linear transformations that map base model embeddings to specific finetuned states. This allows a single base model to simulate the behavior of multiple specialized models by applying distinct li...	03-29 08:01	Success	-	View
exp_2512.14896v1_20260307_104007 Paper: 2512.14896v1	Benchmark: External RAG Pipeline (Backfill Candidate 2512.14896v1) Architecture DrugRAG is a model-agnostic, three-step Retrieval-Augmented Generation (RAG) pipeline. It functions as an external wrapper, retrieving structured drug knowledge to augment prompts without modifying the underlying LLM archit...	03-29 08:01	Success	-	View
exp_2512.14908v5_20260307_113120 Paper: 2512.14908v5	Benchmark: ATLAS (Adjacency-Free Inference) vs. Traditional GNN Architecture: ATLAS is a propagation-free framework replacing message passing with multi-resolution community features. It utilizes modularity-guided search to identify optimal community scales, projects these structures into embeddings...	03-29 08:01	Success	-	View
exp_2512.14910v1_20260306_115659 Paper: 2512.14910v1	Here is the design for the benchmark based on the "AgroAskAI" analysis and the required "Strategic Pivot" to fit 8GB VRA... README.md AgroAskAI: Efficiency & VRAM Constraint Benchmark 1. Context & Strategic Pivot (Step 11) The original AgroAskAI proposal suggests a Multi-Agent System (MAS) using a Chain-of-Responsibility (Router -> Specialist -> Synthesizer)...	03-29 08:01	Success	-	View
exp_2512.14925v2_20260306_140457 Paper: 2512.14925v2	Here is a runnable benchmark suite designed to validate the VRAM efficiency claims of the MAHA proposal. Architecture: MAHA replaces standard MHSA with a hybrid dilated-convolutional transformer backbone. It utilizes learnable downsampling to partition inputs into hierarchical scales and aggregates attention maps using differentiable conve...	03-29 08:01	Success	-	View
exp_2512.14925v2_20260307_081020 Paper: 2512.14925v2	Here is the runnable benchmark code for the Multiscale Aggregated Hierarchical Attention (MAHA) innovation. Architecture: MAHA replaces standard MHSA with a hybrid dilated-convolutional transformer backbone. It utilizes learnable downsampling to partition inputs into hierarchical scales and aggregates attention maps using differentiable conve...	03-29 08:01	Success	-	View
exp_2512.14930v1_20260306_140625 Paper: 2512.14930v1	RMPMAB-Inspired KV-Cache Eviction Benchmark Architecture: Proposes a Restless Multi-Process Multi-Armed Bandit (RMPMAB) framework. Instead of deep neural networks, it models imaging regions as ensembles of Markov chains to capture biological heterogeneity. It relies on scalable W...	03-29 08:01	Success	-	View
exp_2512.14930v1_20260307_124800 Paper: 2512.14930v1	--- Architecture: Proposes a Restless Multi-Process Multi-Armed Bandit (RMPMAB) framework. Instead of deep neural networks, it models imaging regions as ensembles of Markov chains to capture biological heterogeneity. It relies on scalable W...	03-29 08:01	Success	-	View
exp_2512.14938v1_20260306_134152 Paper: 2512.14938v1	Here is the design for the benchmark, simulating the specific memory efficiencies claimed by the TalkVerse architecture... Architecture The model utilizes a 5B parameter Diffusion Transformer (DiT) built upon Wan2.2. To manage long-form generation, it employs a sliding window mechanism with motion-frame context and a high-compression Video VAE. **Memory Foo...	03-29 08:01	Success	-	View
exp_2512.14938v1_20260306_134950 Paper: 2512.14938v1	Here is the runnable benchmark designed to test the architectural claims of the "TalkVerse" innovation (Sliding Window A... Architecture The model utilizes a 5B parameter Diffusion Transformer (DiT) built upon Wan2.2. To manage long-form generation, it employs a sliding window mechanism with motion-frame context and a high-compression Video VAE. **Memory Foo...	03-29 08:01	Success	-	View
exp_2512.14938v1_20260306_152312 Paper: 2512.14938v1	Benchmark: TalkVerse Efficiency Simulation (Linear Attention + High Compression) Architecture The model utilizes a 5B parameter Diffusion Transformer (DiT) built upon Wan2.2. To manage long-form generation, it employs a sliding window mechanism with motion-frame context and a high-compression Video VAE. **Memory Foo...	03-29 08:01	Success	-	View
exp_2512.14938v1_20260307_154128 Paper: 2512.14938v1	Wan2.2-5B Video Generation Benchmark Architecture The model utilizes a 5B parameter Diffusion Transformer (DiT) built upon Wan2.2. To manage long-form generation, it employs a sliding window mechanism with motion-frame context and a high-compression Video VAE. **Memory Foo...	03-29 08:01	Success	-	View
exp_2512.14941v1_20260306_153138 Paper: 2512.14941v1	Benchmark: Physics-Informed Neural Networks (PINNs) on Complex 3D Geometries README.md Benchmark: Physics-Informed Neural Networks (PINNs) on Complex 3D Geometries This benchmark evaluates the computational performance of the PINN methodology described in Backfill Candidate 2512.14941v1. Context & Strategic Alig...	03-29 08:01	Success	-	View
exp_2512.14944v1_20260307_124911 Paper: 2512.14944v1	Puzzle Curriculum GRPO Benchmark Architecture & Methodology PC-GRPO is a post-training reinforcement learning algorithm for VLMs (tested on Qwen-3B/7B). It eliminates external verifiers by using self-supervised "puzzle" environments (PatchFit, Rotation, Jigsaw) to gene...	03-29 08:01	Success	-	View
exp_2512.14946v1_20260306_152604 Paper: 2512.14946v1	EVICPRESS Memory Optimization Benchmark Summary for ARES 8GB Roadmap: * Architecture: A multi-tier KV management system (GPU VRAM to CPU RAM) that jointly optimizes eviction and lossy compression. It utilizes a "unified utility function" to balance quality loss against la...	03-29 08:01	Success	-	View
exp_2512.14946v1_20260307_094824 Paper: 2512.14946v1	EVICPRESS Benchmark Simulation Summary for ARES 8GB Roadmap: * Architecture: A multi-tier KV management system (GPU VRAM to CPU RAM) that jointly optimizes eviction and lossy compression. It utilizes a "unified utility function" to balance quality loss against la...	03-29 08:01	Success	-	View
exp_2512.14954v1_20260307_090240 Paper: 2512.14954v1	```markdown Summary for ARES 8GB Roadmap Architecture: Proposes a probabilistic framework to align teacher and student probability spaces across distinct tokenizers. By exploiting the recursive structure of Byte-Pair Encoding (BPE), it enables...	03-29 08:01	Success	-	View
exp_2512.14961v3_20260307_090321 Paper: 2512.14961v3	Here is the runnable benchmark designed for Backfill Candidate 2512.14961v3 (Hybrid Multimodal Fusion). Architecture: Utilizes a hybrid trimodal framework (face, voice, motion) with independent encoders feeding into a cross-attention and gated fusion module. It employs a single classification head with a confidence-weighted strategy to dy...	03-29 08:01	Success	-	View
exp_2601.10859v1_20260306_165652 Paper: 2601.10859v1	Project ARES: Topology-Inspired KV-Cache Routing README.md Project ARES: Topology-Inspired KV-Cache Routing Innovation: Application of Structural Topology Optimization to LLM Inference. Overview This benchmark validates the concept of using a lightweight "Router Agent" (inspired by th...	03-29 08:01	Success	-	View
exp_2601.10873v1_20260306_165755 Paper: 2601.10873v1	Benchmark: Unit-Consistent (UC) Backpropagation vs. Standard Backprop README.md Benchmark: Unit-Consistent (UC) Backpropagation vs. Standard Backprop Innovation: Backfill Candidate 2601.10873v1 Context: 8GB Efficiency Frontier (RTX A2000 Class) Overview Standard backpropagation in ReLU networks suffer...	03-29 08:01	Failed	RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 3D	View
exp_2601.10880v1_20260306_140427 Paper: 2601.10880v1	Benchmark: Medical SAM3 VRAM Efficiency Frontier README.md Benchmark: Medical SAM3 VRAM Efficiency Frontier Candidate ID: 2601.10880v1 Subject: Medical SAM3 (3D Transformer Adaptation) System Constraints: 8GB VRAM Limit (RTX A2000 Class) ARES Verdict: DO NOT IMPLEMENT (Har...	03-29 08:01	Success	-	View
exp_2601.10905v1_20260306_171517 Paper: 2601.10905v1	Benchmark: Action Shapley Data Selection Efficiency README.md Benchmark: Action Shapley Data Selection Efficiency Overview This benchmark evaluates the computational efficiency of the Action Shapley data selection methodology introduced in the paper 2601.10905v1. **Innovation Summa...	03-29 08:01	Success	-	View
exp_2601.11557v1_20260307_154325 Paper: 2601.11557v1	Benchmark: Information-Theoretic Binarization (MIB) vs. Float32 HNSW Architecture: Replaces the standard "HNSW + float32" stack with Maximally Informative Binarization (MIB). The system utilizes exhaustive search over 1-bit binary vectors using bitwise distance metrics and Information-Theoretic Scori...	03-29 08:01	Success	-	View
exp_2601.11657v1_20260306_163120 Paper: 2601.11657v1	D-PARC Innovation Benchmark README.md D-PARC Innovation Benchmark This benchmark evaluates the D-PARC (Deformable Physics-Aware Recurrent Convolutions) methodology. It simulates the core "Active Filtration" and "Smarter, Not Bigger" paradigm by comparing a standar...	03-29 08:01	Success	-	View
exp_2601.11659v1_20260306_115302 Paper: 2601.11659v1	Benchmark: Llama 4 Hybrid MoE vs. Dense Inference (8GB VRAM Optimization) README.md Benchmark: Llama 4 Hybrid MoE vs. Dense Inference (8GB VRAM Optimization) Overview This benchmark evaluates the Hardware Awareness and Efficiency Frontier improvements proposed for the Llama 4-inspired architecture on cons...	03-29 08:01	Success	-	View
exp_2601.11659v1_20260306_134246 Paper: 2601.11659v1	This benchmark evaluates the efficiency of the proposed Llama 4-style Mixture of Experts (MoE) architecture against trad... README.md This benchmark evaluates the efficiency of the proposed Llama 4-style Mixture of Experts (MoE) architecture against traditional Dense Transformers within an 8GB VRAM constraint (simulated here for adaptability). Objective To valid...	03-29 08:01	Success	-	View
exp_2601.11660v1_20260306_140352 Paper: 2601.11660v1	MBU-Net Efficiency Benchmark README.md MBU-Net Efficiency Benchmark Innovation: Masked Binary U-Net (MBU-Net) / "Backfill Candidate 2601.11660v1" Objective: Validate memory footprint reduction and inference efficiency via "Cost-Aware Masked Binary Quantization"...	03-29 08:01	Success	-	View
exp_2601.11663v1_20260306_170834 Paper: 2601.11663v1	Benchmark: Unified Activation Sensitivity Framework (ARES Strategy) README.md Benchmark: Unified Activation Sensitivity Framework (ARES Strategy) Overview This benchmark evaluates the "Unified Activation Sensitivity" innovation described in Backfill Candidate 2601.11663v1. The Innovation: The paper pr...	03-29 08:01	Success	-	View
exp_2601.11664v1_20260306_134736 Paper: 2601.11664v1	Here is the runnable benchmark code designed to simulate and evaluate the metrics presented in the "Serverless AI Shield... While the analysis log suggests skipping this for the local 8GB VRAM objective, the benchmark below validates the paper's specific claims regarding cloud-based FaaS security (Detection Rate and Latency Overhead) in a simulated environment....	03-29 08:01	Success	-	View
exp_2602.13871v1_20260306_163409 Paper: 2602.13871v1	Here is the design for the Ens-CGP (Ensemble Conditional Gaussian Process) benchmark, specifically tailored for the... Design Rationale To properly benchmark this innovation without requiring the implementation of the full mathematical engine, we focus on the computational complexity and memory footprint claims of the paper: 1. **Standard Transforme...	03-29 08:01	Success	-	View
exp_2602.13914v1_20260306_135050 Paper: 2602.13914v1	```markdown bash python benchmark.py ```	03-29 08:01	Success	-	View
exp_2602.13914v1_20260306_163305 Paper: 2602.13914v1	Benchmark: Polytopological Propositional Dynamic Logic (PDL) Evaluator README.md Benchmark: Polytopological Propositional Dynamic Logic (PDL) Evaluator Context This benchmark evaluates the computational feasibility of the Polytopological Propositional Dynamic Logic system proposed in the paper "Backfill Ca...	03-29 08:01	Success	-	View
exp_2602.13921v1_20260306_164003 Paper: 2602.13921v1	GREPO-Lite: VRAM Efficiency Benchmark README.md GREPO-Lite: VRAM Efficiency Benchmark This benchmark evaluates the memory efficiency and processing speed of a Graph Neural Network (GNN) architecture similar to GREPO, designed for repository-level bug localization. Objective...	03-29 08:01	Failed	RuntimeError: The expanded size of the tensor (50000) must match the existing size (250000) at non-singleton dimension 0. Target sizes: [50000, 256]. Tensor sizes: [250000, 1]	View
exp_2603.00084v2_20260306_165828 Paper: 2603.00084v2	Title: Benchmark: DeepXiv-SDK (Structured JSON vs. Unstructured PDF) README.md Title: Benchmark: DeepXiv-SDK (Structured JSON vs. Unstructured PDF) Description: This benchmark evaluates the VRAM efficiency of the proposed "DeepXiv-SDK" innovation. The core hypothesis is that shifting from Unstruc...	03-29 08:01	Success	-	View
exp_2603.05437v1_20260306_163229 Paper: 2603.05437v1	```markdown bash python benchmark.py	03-29 08:01	Success	-	View
exp_2603.05451v1_20260306_165623 Paper: 2603.05451v1	Backfill Candidate 2603.05451v1: A2000 "Low-Mem" Adapter README.md Backfill Candidate 2603.05451v1: A2000 "Low-Mem" Adapter Innovation Summary This benchmark validates a derivative strategy extracted from FlashAttention-4 (Candidate 2603.05451v1), adapted for the RTX A2000 (Ampere) archit...	03-29 08:01	Failed	UnboundLocalError: cannot access local variable 'attn_out' where it is not associated with a value	View
exp_2603.05459v1_20260306_134034 Paper: 2603.05459v1	DEBISS Corpus Stress Test Benchmark README.md DEBISS Corpus Stress Test Benchmark Innovation: Backfill Candidate 2603.05459v1 (DEBISS Multi-Modal Corpus) Assessment: High-Compute Load Data Resource Objective: To provide a synthetic, runnable simulation of the memo...	03-29 08:01	Success	-	View
exp_2603.05462v1_20260306_133935 Paper: 2603.05462v1	Benchmark: NCTB-QA Bangla Reading Comprehension This benchmark evaluates the performance of Transformer models (specifically BERT) on the NCTB-QA task (Bangla Reading Comprehension with unanswerable questions). As the full NCTB-QA dataset (87,805 pairs) requires external file handlin...	03-29 08:01	Success	-	View
exp_2603.05462v1_20260306_134402 Paper: 2603.05462v1	2603.05462v1 No summary available yet.	03-29 08:01	Success	-	View
exp_2603.05462v1_20260306_152102 Paper: 2603.05462v1	Benchmark: NCTB-QA Baseline (Backfill Candidate 2603.05462v1) README.md Benchmark: NCTB-QA Baseline (Backfill Candidate 2603.05462v1) Overview This benchmark evaluates the computational efficiency of the method described in the paper "NCTB-QA: A Large-Scale Dataset for Low-Resource Language Question A...	03-29 08:01	Success	-	View
exp_2603.05468v1_20260306_163914 Paper: 2603.05468v1	Benchmark: Neural Quantum Estimator with Kraus Constraints README.md Benchmark: Neural Quantum Estimator with Kraus Constraints Overview This benchmark evaluates the "Kraus-structured output layer" innovation, applied to a Mamba-like Linear SSM architecture. It simulates the inference effic...	03-29 08:01	Failed	RuntimeError: expected a matrix	View
exp_2603.05485v1_20260306_163742 Paper: 2603.05485v1	Here is the runnable benchmark design. Per the analysis that running a full Judge model is infeasible for 8GB VRAM, this... No summary available yet.	03-29 08:01	Success	-	View
exp_2603.05495v1_20260306_231345 Paper: 2603.05495v1	```markdown README.md bash pip install torch numpy scipy bash python benchmark.py	03-29 08:01	Success	-	View
exp_2603.05498v1_20260306_082531 Paper: 2603.05498v1	2603.05498v1 No summary available yet.	03-29 08:01	Success	-	View
exp_2603.05504v1_20260306_155252 Paper: 2603.05504v1	```markdown No summary available yet.	03-29 08:01	Success	-	View
exp_2603.05507v1_20260307_161715 Paper: 2603.05507v1	Transformer-Based Inpainting for Sparse 3D Streaming Benchmark README.md Transformer-Based Inpainting for Sparse 3D Streaming Benchmark This benchmark evaluates the performance of a simplified, synthetic implementation of the proposed "Transformer-Based Inpainting" module designed for real-time 3D stre...	03-29 08:01	Success	-	View
exp_core_304987179_20260307_080420 Paper: core_304987179	Benchmark: RazorAttention KV Cache Compression README.md Benchmark: RazorAttention KV Cache Compression This repository contains a lightweight, runnable benchmark simulating the RazorAttention technique for efficient KV cache compression. Overview RazorAttention optimizes Long-Conte...	03-29 08:01	Success	-	View
exp_cr_10.1007_s12046-026-03064-1_20260307_082916 Paper: cr_10.1007_s12046-026-03064-1	Benchmark: Hybrid EfficientNet-B7 + ViT Candidate README.md Benchmark: Hybrid EfficientNet-B7 + ViT Candidate Candidate ID: cr_10.1007_s12046-026-03064-1 Overview This benchmark evaluates the computational feasibility of the proposed hybrid architecture combining EfficientNet-B7 wi...	03-29 08:01	Pending	-	View
exp_cr_10.1007_s12046-026-03064-1_20260307_083140 Paper: cr_10.1007_s12046-026-03064-1	Here is the runnable benchmark design for the candidate innovation. No summary available yet.	03-29 08:01	Success	-	View
exp_cr_10.1007_s42399-026-02316-9_20260307_162004 Paper: cr_10.1007_s42399-026-02316-9	```markdown bash pip install torch transformers bash python benchmark.py	03-29 08:01	Success	-	View
exp_cr_10.1038_s41598-026-39986-3_20260307_095821 Paper: cr_10.1038_s41598-026-39986-3	--- README.md --- MI-SOH: Multi-scale Inverted Transformer Benchmark Overview This benchmark implements the MI-SOH (Multi-scale Inverted Transformer for State-of-Health) architecture described in the innovation candidate. It combines **dila...	03-29 08:01	Success	-	View
exp_cr_10.1038_s41698-025-01103-4_20260307_113300 Paper: cr_10.1038_s41698-025-01103-4	Benchmark: LLM-AIx (Local Information Extraction) Summary: LLM-AIx Pipeline for Oncology * Architecture: The paper outlines LLM-AIx, a software protocol acting as a wrapper for open-source, privacy-preserving LLMs. It is designed to extract structured clinical data (e.g., TNM s...	03-29 08:01	Success	-	View
exp_cr_10.1088_1361-6501_ae46b7_20260307_081748 Paper: cr_10.1088_1361-6501_ae46b7	Here is the benchmark for the Bi-Mamba Time Series Regression architecture. bash python benchmark.py	03-29 08:01	Success	-	View
exp_cr_10.1145_3768167_20260307_105805 Paper: cr_10.1145_3768167	Section 1: README.md Architecture The paper proposes a Graph-Transformer Network (GTN) acting as a surrogate model for circuit topology optimization. It encodes circuit physics specifically—voltage changes in loops and current flows—directly into graph embe...	03-29 08:01	Success	-	View
exp_cr_10.1515_jiip-2022-0050_20260307_160722 Paper: cr_10.1515_jiip-2022-0050	Benchmark: Multi-Fidelity Elasticity Surrogate (cr_10.1515_jiip-2022-0050) Architecture Proposes a multi-fidelity framework combining a low-fidelity Deep Neural Network (DNN) surrogate with a high-fidelity physical model for Bayesian inference on elastic properties. The DNN handles the bulk of the prior distri...	03-29 08:01	Success	-	View
exp_cr_10.1609_aaai.v38i12.29197_20260307_083540 Paper: cr_10.1609_aaai.v38i12.29197	Benchmark: Excel Transformer (60M Params) Architecture: FLAME is a 60M parameter Transformer optimized specifically for Excel formulas. Key architectural differentiators include an Excel-specific tokenizer and domain-adapted pre-training objectives: masked span prediction and n...	03-29 08:01	Success	-	View
exp_cr_10.1609_aaai.v38i16.29765_20260307_124634 Paper: cr_10.1609_aaai.v38i16.29765	Benchmark: The Lens of Perturbation in LLM Quantization Architecture: Introduces a "perturbation lens" framework, analyzing quantization error as additive noise to weights and activations. This theory supports a non-uniform quantization scheme that adapts grid spacing to activation sensitivi...	03-29 08:01	Success	-	View
exp_cr_10.1609_aaai.v38i17.29815_20260307_124257 Paper: cr_10.1609_aaai.v38i17.29815	Benchmark: Norm Tweaking for Low-Bit LLM Quantization Architecture: A plugin for existing Post-Training Quantization (PTQ) pipelines. It does not alter core Transformer blocks but modifies Layer Normalization weights. The method aligns the distribution of quantized activations with their f...	03-29 08:01	Success	-	View
exp_cr_10.1609_aaai.v38i17.29822_20260307_161200 Paper: cr_10.1609_aaai.v38i17.29822	Benchmark: LatestEval Dynamic Evaluation Protocol README.md Benchmark: LatestEval Dynamic Evaluation Protocol Innovation: LatestEval (AAAI 2024) Concept: A dynamic evaluation protocol that constructs tests from "future" data (published after model training cutoffs) to mitigate data...	03-29 08:01	Success	-	View
exp_cr_10.1609_aaai.v38i21.30443_20260307_110307 Paper: cr_10.1609_aaai.v38i21.30443	Benchmark: Structured Prompting for Bias Mitigation Summary for ARES 8GB Roadmap * Architecture: This research proposes a software-layer methodology rather than a neural architecture. It utilizes existing Transformer-based models, relying on structured prompt engineering (context...	03-29 08:01	Success	-	View
exp_cr_10.2196_67967_20260307_081828 Paper: cr_10.2196_67967	Benchmark for Backfill Candidate: cr_10.2196_67967 Architecture: The study evaluates a fine-tuned `scispaCy` model against two domain-specific LLMs: NYUTron (110M parameters) and GatorTron (345M parameters). Both are highly optimized "tiny" architectures suitable for clinical NL...	03-29 08:01	Success	-	View
exp_cr_10.24252_literatify.v5i1.44458_20260307_094222 Paper: cr_10.24252_literatify.v5i1.44458	Benchmark: Classical VSM Retrieval-Augmented Generation (RAG) Report: Literature Review on Vector Space Models (VSM) Type: Literature Review (Traditional Information Retrieval) Relevance: Low (Non-Neural), but applicable to RAG preprocessing. * Architecture: Analyzes the classic **Vect...	03-29 08:01	Success	-	View
exp_cr_10.24425_jppr.2024.151253_20260307_102906 Paper: cr_10.24425_jppr.2024.151253	Benchmark: Hybrid Swin-Transformer YOLOv5 vs. Standard CNN Architecture: Modifies the YOLOv5m baseline by integrating a Swin Transformer (Swin-T) module into the backbone network. It also utilizes K-means++ for anchor optimization and Efficient IoU (EIoU) loss to improve bounding box regression...	03-29 08:01	Success	-	View
exp_cr_10.29019_enfoqueute.1204_20260307_071307 Paper: cr_10.29019_enfoqueute.1204	```markdown README.md bash pip install torch transformers accelerate psutil bash python benchmark.py	03-29 08:01	Pending	-	View
exp_cr_10.29019_enfoqueute.1204_20260307_110702 Paper: cr_10.29019_enfoqueute.1204	This benchmark evaluates the performance efficiency of Mamba, a State Space Model (SSM), compared to a traditional T... README.md This benchmark evaluates the performance efficiency of Mamba, a State Space Model (SSM), compared to a traditional Transformer architecture (GPT-2). The innovation of Mamba lies in its ability to maintain linear time complexit...	03-29 08:01	Success	-	View
exp_cr_10.3233_mas-221411_20260307_124500 Paper: cr_10.3233_mas-221411	--- README.md Benchmark: Bayesian Inference with Smoothed Dirichlet Priors This repository contains a runnable benchmark designed to evaluate the computational performance and accuracy of Bayesian inference using Smoothed Dirichlet Priors o...	03-29 08:01	Success	-	View
exp_cr_10.3389_frobt.2025.1518965_20260307_080556 Paper: cr_10.3389_frobt.2025.1518965	Model Compression Benchmark: Precision Reduction & Pruning This paper provides a comprehensive methodological framework for optimizing Large Language Models (LLMs) within the ARES 8GB hardware constraints. As a survey, it does not propose a specific architecture but evaluates compression techniques...	03-29 08:01	Success	-	View
exp_cr_10.3390_agronomy14040673_20260307_085702 Paper: cr_10.3390_agronomy14040673	Benchmark: Hybrid CNN-Transformer for Agronomy (cr_10.3390_agronomy14040673) Architecture: Hybrid framework combining a Densely Connected CNN for multilevel local feature extraction with a Transformer module for global context capture. A Cycle-GAN is utilized for training data augmentation but is excluded during...	03-29 08:01	Success	-	View
exp_cr_10.3390_app14188526_20260307_103037 Paper: cr_10.3390_app14188526	SA-LSTM Time Series Regression Benchmark Summary for ARES 8GB Roadmap * Architecture: The paper proposes a hybrid Long Short-Term Memory (LSTM) network integrated with a Self-Attention Mechanism (SA-LSTM). This architecture weights specific time-steps in the input...	03-29 08:01	Success	-	View
exp_cr_10.3390_designs10020030_20260307_103414 Paper: cr_10.3390_designs10020030	Benchmark: Local VLM Viability on ARES 8GB Roadmap README.md Benchmark: Local VLM Viability on ARES 8GB Roadmap Context: The target candidate (`cr_10.3390_designs10020030`) proposes a cloud-centric hybrid architecture utilizing the ChatGPT API. The review highlights that this is **Low F...	03-29 08:01	Success	-	View
exp_cr_10.3390_electronics13183710_20260307_082805 Paper: cr_10.3390_electronics13183710	Section 1: README.md Architecture: Hybrid model utilizing multi-scale frequency decomposition. High-frequency data is processed via a Temporal GNN with an Adaptive Graph Learning module, while low-frequency data uses a Bidirectional Temporal Network, fused...	03-29 08:01	Success	-	View
exp_cr_10.3390_en18184924_20260307_113225 Paper: cr_10.3390_en18184924	Section 1: README.md Architecture: The proposed model is a hybrid statistical system combining Monte Carlo filters for state estimation with a clustering algorithm (likely K-Means or similar) for outlier removal and forecasting. It is not a neural network o...	03-29 08:01	Success	-	View
exp_cr_10.3390_math12182941_20260307_083419 Paper: cr_10.3390_math12182941	Benchmark: Arabic Transformer Ensemble (AMFND) Architecture: Proposes a weighted-average ensemble of five heterogeneous Arabic Transformers (AraBERT, MARBERT, AraELECTRA, AraGPT2, ARBERT). Memory Footprint: Critical Bottleneck. Concurrently loading five distinct encoder/deco...	03-29 08:01	Success	-	View
exp_cr_10.3390_rs17183200_20260307_085344 Paper: cr_10.3390_rs17183200	TransMambaCNN Benchmark Architecture TransMambaCNN utilizes a dual-branch topology to fuse global and local spatiotemporal features. The global branch replaces standard self-attention with a Convolutional State-Space Module (C-SSM), combining an Attentive...	03-29 08:01	Success	-	View
exp_cr_10.3390_rs18050793_20260307_081336 Paper: cr_10.3390_rs18050793	Here is the benchmark design for the underwater fusion architecture with Variable Mixture-of-Experts (vMoE). README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_cr_10.3390_rs18050793_20260307_081511 Paper: cr_10.3390_rs18050793	Benchmark: Underwater Fusion vMoE (cr_10.3390_rs18050793) README.md Benchmark: Underwater Fusion vMoE (cr_10.3390_rs18050793) Overview This benchmark evaluates the performance characteristics of the Variable Mixture-of-Experts (vMoE) mechanism proposed for fusing camera and sonar data in under...	03-29 08:01	Success	-	View
exp_cr_10.3390_s24072091_20260307_161606 Paper: cr_10.3390_s24072091	Benchmark: Bayesian Neural Network (BNN) Surrogate for Structural Health Monitoring Paper Analysis: BNNs for Structural Health Monitoring (SHM) Architecture: The paper proposes a Bayesian Neural Network (BNN) utilizing probabilistic inference to predict structural displacement. It operates within a "dual-drive"...	03-29 08:01	Success	-	View
exp_cr_10.3390_s25185786_20260307_085434 Paper: cr_10.3390_s25185786	MFT-Net: Hybrid CNN-Transformer Benchmark Architecture The paper proposes MFT-Net, a hybrid architecture that integrates a Convolutional Neural Network (CNN) for local feature extraction with a Transformer module for global dependency modeling. It utilizes Squeeze-and-Excitatio...	03-29 08:01	Success	-	View
exp_cr_10.3390_s25185805_20260307_125105 Paper: cr_10.3390_s25185805	FILE_BREAK Architecture: Uses a customized BLIP-2 framework with a Q-Former to fuse heterogeneous inputs (visual frames, kinematic data) into low-dimensional embeddings representing "task demand" and "driving capability" within a shared latent...	03-29 08:01	Pending	-	View
exp_cr_10.3390_s25185805_20260307_155403 Paper: cr_10.3390_s25185805	This benchmark evaluates the performance characteristics of the BLIP-2 architecture when utilized for embedding extracti... Architecture: Uses a customized BLIP-2 framework with a Q-Former to fuse heterogeneous inputs (visual frames, kinematic data) into low-dimensional embeddings representing "task demand" and "driving capability" within a shared latent...	03-29 08:01	Success	-	View
exp_cr_10.3390_sym17030471_20260307_154914 Paper: cr_10.3390_sym17030471	Benchmark: Improved Model-Free Adaptive Predictive Control (MFAPC) under DoS and Quantization Verdict: Incompatible This paper addresses Control Theory (Model-Free Adaptive Predictive Control), not Deep Learning. It focuses on networked cyber-physical systems under DoS attacks and does not describe a neural network architect...	03-29 08:01	Success	-	View
exp_cr_10.36724_2072-8735-2024-18-3-41-49_20260307_110401 Paper: cr_10.36724_2072-8735-2024-18-3-41-49	Backfill Candidate: cr_10.36724_2072-8735-2024-18-3-41-49 Status: Irrelevant This paper addresses telecommunications protocols (specifically queueing theory and traffic shaping for high-throughput satellites), not Deep Learning. * Architecture: N/A. The paper proposes a mathematical pr...	03-29 08:01	Success	-	View
exp_cr_10.3897_jucs.94657_20260307_160925 Paper: cr_10.3897_jucs.94657	Section 1: README.md PlantKViT Architecture Benchmark This benchmark evaluates the performance characteristics of the PlantKViT hybrid architecture (Vision Transformer + KNN Classifier). Architecture Overview The benchmark simulates the deployment scenario...	03-29 08:01	Success	-	View
exp_cr_10.51519_journalisi.v7i1.1024_20260307_093251 Paper: cr_10.51519_journalisi.v7i1.1024	--- Subject: IT-Based Knowledge Sharing System with LLM Integration Architecture: Conceptual system architecture proposing the integration of Large Language Models (specifically ChatGPT) into university IT ticketing systems. The design...	03-29 08:01	Pending	-	View
exp_cr_10.51519_journalisi.v7i1.1024_20260307_095059 Paper: cr_10.51519_journalisi.v7i1.1024	Benchmark: Local Knowledge Sharing System (RAG-Lite) Subject: IT-Based Knowledge Sharing System with LLM Integration Architecture: Conceptual system architecture proposing the integration of Large Language Models (specifically ChatGPT) into university IT ticketing systems. The design...	03-29 08:01	Success	-	View
exp_cr_10.52783_jisem.v10i3.4744_20260307_083344 Paper: cr_10.52783_jisem.v10i3.4744	This benchmark evaluates the computational efficiency of a hybrid Enhanced Vision Transformer (EViT) + BiLSTM archit... Architecture: The paper proposes a hybrid architecture combining an Enhanced Vision Transformer (EViT) with a Bidirectional LSTM (BiLSTM) for glaucoma detection. The EViT extracts global spatial features, while the BiLSTM processes sequ...	03-29 08:01	Success	-	View
exp_cr_10.55041_ijsrem57223_20260307_103235 Paper: cr_10.55041_ijsrem57223	BiLAT Architecture Benchmark This benchmark implements a representative BiLAT (Bidirectional LSTM with Attention and Transformer components) model to verify the architectural claims regarding memory footprint and inference speed. Architecture Details The implemente...	03-29 08:01	Success	-	View
exp_cr_10.58414_scientifictemper.2025.16.2.03_20260307_110040 Paper: cr_10.58414_scientifictemper.2025.16.2.03	Summary of reasoning Analysis for ARES 8GB Roadmap * Architecture: The MRMGKTL model combines a standard Transformer encoder with a Gaussian Kernel classifier. Crucially, it utilizes a pre-processing pipeline involving Sokal–Michener’s multivariate reli...	03-29 08:01	Success	-	View
exp_gh_Dao-AILab_flash-attention_20260307_164230 Paper: gh_Dao-AILab_flash-attention	This repository contains a minimal benchmark to evaluate the performance and memory efficiency of Dao-AILab/flash-atte... README.md This repository contains a minimal benchmark to evaluate the performance and memory efficiency of Dao-AILab/flash-attention**. Overview Flash Attention is a precise attention algorithm that significantly reduces memory usage (HB...	03-29 08:01	Success	-	View
exp_gh_EvanVOSSIER_birdnet-onnx-converter_20260307_215337 Paper: gh_EvanVOSSIER_birdnet-onnx-converter	Benchmark: EvanVOSSIER/birdnet-onnx-converter README.md Benchmark: EvanVOSSIER/birdnet-onnx-converter This benchmark evaluates the inference performance of BirdNET models converted to ONNX format. It focuses on measuring the throughput (audio processed per second) and memory usage (VRA...	03-29 08:01	Success	-	View
exp_gh_huggingface_transformers_20260307_170900 Paper: gh_huggingface_transformers	Hugging Face Transformers Inference Benchmark README.md Hugging Face Transformers Inference Benchmark This repository contains a focused benchmark designed to evaluate the inference performance of the `huggingface/transformers` library. The objective is to measure the efficiency of a s...	03-29 08:01	Success	-	View
exp_gh_robloxexploiterponole_aegis-trainer_20260308_000255 Paper: gh_robloxexploiterponole_aegis-trainer	AEGIS AI Trainer: Layer-Streaming Benchmark README.md AEGIS AI Trainer: Layer-Streaming Benchmark This benchmark demonstrates the core innovation behind AEGIS AI Trainer: the ability to train massive Mixture of Experts (MoE) and dense models (80B+ parameters) on consumer hardware...	03-29 08:01	Success	-	View
exp_gh_svg-project_Sparse-VideoGen_20260307_231058 Paper: gh_svg-project_Sparse-VideoGen	Here is the benchmark design for the svg-project/Sparse-VideoGen innovation. This benchmark focuses on the core efficiency claim: replacing Dense Global Attention with Sparse Sliding-Window Attention to reduce VRAM usage and increase throughput in Video Diffusion Transformers. --- README.md Benchmark: Sparse VideoGe...	03-29 08:01	Success	-	View
exp_gh_vllm-project_vllm_20260307_162231 Paper: gh_vllm-project_vllm	vLLM Benchmark Suite README.md vLLM Benchmark Suite This benchmark evaluates the inference performance of vLLM, a high-throughput and memory-efficient inference engine. It focuses on measuring the engine's ability to manage KV Cache memory (PagedAttention)...	03-29 08:01	Success	-	View
exp_hf_2603.03942_20260308_040339 Paper: hf_2603.03942	Benchmark: Lightweight Visual Reasoning Feedback Loop This benchmark simulates the architectural difference between a standard Vision-Language Model (VLM) and the proposed Lightweight Visual Reasoning approach. The Innovation: The paper introduces a "language-to-vision feedback module"...	03-29 08:01	Success	-	View
exp_hf_2603.04800_20260307_070100 Paper: hf_2603.04800	Benchmark: MASQuant Modality-Aware Quantization README.md Benchmark: MASQuant Modality-Aware Quantization This benchmark evaluates the performance characteristics of MASQuant (Modality-Aware Smoothing Quantization) principles applied to a Multimodal Large Language Model (MLLM) archit...	03-29 08:01	Pending	-	View
exp_hf_2603.04800_20260307_070855 Paper: hf_2603.04800	Benchmark: MASQuant (Modality-Aware Smoothing Quantization) README.md Benchmark: MASQuant (Modality-Aware Smoothing Quantization) This repository provides a lightweight, synthetic benchmark to evaluate the core performance benefits of MASQuant, specifically focusing on its ability to handle...	03-29 08:01	Success	-	View
exp_hf_2603.04800_20260307_164050 Paper: hf_2603.04800	MASQuant Benchmark Suite README.md MASQuant Benchmark Suite This benchmark evaluates the Modality-Aware Smoothing Quantization (MASQuant) framework. Overview MASQuant addresses "Smoothing Misalignment" and "Cross-Modal Computational Invariance" in Multimodal LL...	03-29 08:01	Success	-	View
exp_oa_W4415031789_20260307_090407 Paper: oa_W4415031789	Here is the benchmark design to validate the findings of the T2I survey paper (Backfill Candidate oa_W4415031789), speci... Architecture: Surveys 141 T2I works (2021–2024), categorizing them into Autoregressive, GAN, and Diffusion foundations. Highlights Mamba and Multimodality as emerging architectures for future performance gains, potentially offering...	03-29 08:01	Success	-	View
exp_oa_W4415248384_20260307_081445 Paper: oa_W4415248384	Innovation Benchmark: Mamba vs Transformer for 6G Edge Inference Subject: Analysis of A Comprehensive Survey of Large AI Models for Future Communications This survey evaluates Large AI Models (LAMs) for 6G, reviewing Transformers, Diffusion, and Mamba architectures. Key takeaways for the ARES 8...	03-29 08:01	Success	-	View
exp_oa_W4415248384_20260307_081717 Paper: oa_W4415248384	Thought Process for Code Generation: Subject: Analysis of A Comprehensive Survey of Large AI Models for Future Communications This survey evaluates Large AI Models (LAMs) for 6G, reviewing Transformers, Diffusion, and Mamba architectures. Key takeaways for the ARES 8...	03-29 08:01	Success	-	View
exp_oa_W7133137559_20260308_082020 Paper: oa_W7133137559	Section 1: README.md Architecture: Theoretical analysis of Transformer embeddings and the $O(n^2)$ complexity of attention mechanisms. Reviews optimization techniques including token pruning, sparse attention, and long-context extensions. **Memory Footprint...	03-29 08:01	Success	-	View
exp_pytrain.20260307082736.001_20260307_082807 Paper: pytrain.20260307082736.001	Python Skill Fallback Title: Automated Package Builder and Strict Type Verifier - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307083022.001_20260307_083054 Paper: pytrain.20260307083022.001	Strictly-Typed Package Dependency Resolver README.md Strictly-Typed Package Dependency Resolver Overview This benchmark implements a robust dependency resolution engine using Python's standard `typing` module. It leverages `Protocol`, `TypedDict`, and Type Aliases to enforce strict...	03-29 08:01	Success	-	View
exp_pytrain.20260307083656.002_20260307_083722 Paper: pytrain.20260307083656.002	Overview README.md Overview This benchmark evaluates the implementation of a Dynamic Generic Plugin Loader utilizing PEP 695 Type Parameter Syntax (available in Python 3.12+). Key Features 1. PEP 695 Implementation: Defines `PluginRegist...	03-29 08:01	Success	-	View
exp_pytrain.20260307085212.003_20260307_085251 Paper: pytrain.20260307085212.003	Strict Type-Hinted Package Builder and Validator README.md Strict Type-Hinted Package Builder and Validator Description This benchmark tests an autonomous coding system's ability to programmatically construct a PEP 561 compliant Python package and validate its structural and type integrit...	03-29 08:01	Success	-	View
exp_pytrain.20260307090113.004_20260307_090149 Paper: pytrain.20260307090113.004	Dynamic Package Scaffolder with Runtime Type Verification This benchmark evaluates an agent's ability to programmatically generate a Python package structure that adheres to packaging standards (PEP 8) and utilizes advanced typing protocols (PEP 484/585). Objective Implement a function `build_and_...	03-29 08:01	Success	-	View
exp_pytrain.20260307093135.001_20260307_093213 Paper: pytrain.20260307093135.001	Self-Introspecting Typed Plugin System README.md Self-Introspecting Typed Plugin System This benchmark demonstrates a robust, self-contained plugin architecture using Python's standard library. It simulates a Python package environment by dynamically generating plugin modules at...	03-29 08:01	Success	-	View
exp_pytrain.20260307094021.001_20260307_094050 Paper: pytrain.20260307094021.001	This benchmark demonstrates the creation of a dynamic, in-memory Python package structure without writing files to disk.... README.md This benchmark demonstrates the creation of a dynamic, in-memory Python package structure without writing files to disk. It utilizes `sys.modules` and `types` to simulate a package named `internal_plugins` containing dynamically g...	03-29 08:01	Success	-	View
exp_pytrain.20260307094405.001_20260307_094459 Paper: pytrain.20260307094405.001	Design rationale: The `benchmark.py` script is designed to fulfill the "Runtime-Verified Plugin Architecture" requirement. 1. Typing: It defines a `DataProcessor[T]` Protocol using `typing` module features. 2. Packaging: It uses `pathlib` to create a...	03-29 08:01	Success	-	View
exp_pytrain.20260307094639.001_20260307_094713 Paper: pytrain.20260307094639.001	Python Skill Fallback Title: Strictly Typed Plugin Architecture with Packaging Simulation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307095241.002_20260307_095316 Paper: pytrain.20260307095241.002	PEP 695 Generic Resource Pool Benchmark This benchmark evaluates the implementation of a generic resource pool using Python 3.12+'s PEP 695 Type Parameter Syntax. Overview Hypothesis: Utilizing Python 3.12+ Type Parameter Syntax allows for more concise and maintainable generi...	03-29 08:01	Success	-	View
exp_pytrain.20260307095939.003_20260307_100015 Paper: pytrain.20260307095939.003	Benchmark: Dynamic Package Loader with Protocol Enforcement README.md Benchmark: Dynamic Package Loader with Protocol Enforcement Objective To evaluate the ability of a Python system to dynamically load code from a temporary file system structure and enforce strict type safety using `typing.Protocol...	03-29 08:01	Success	-	View
exp_pytrain.20260307100559.004_20260307_100638 Paper: pytrain.20260307100559.004	Robust CLI Configuration Merger README.md Robust CLI Configuration Merger Objective This benchmark evaluates the ability to write a robust, type-safe Python utility that performs a recursive deep merge of JSON configurations. The solution must adhere to strict static typi...	03-29 08:01	Success	-	View
exp_pytrain.20260307102243.005_20260307_102322 Paper: pytrain.20260307102243.005	Python Skill Fallback Title: Robust Typed Plugin Loader with Namespace Inspection - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307102938.006_20260307_102957 Paper: pytrain.20260307102938.006	Python Skill Fallback Title: Typing-Driven Dynamic Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307103844.007_20260307_103921 Paper: pytrain.20260307103844.007	Type-Safe Generic Component Registry Benchmark README.md Title: Type-Safe Generic Component Registry Benchmark Description This benchmark evaluates the implementation of a modular, type-safe dependency-injection style registry system using Python's standard library. It focuses on struct...	03-29 08:01	Success	-	View
exp_pytrain.20260307104523.008_20260307_104609 Paper: pytrain.20260307104523.008	Python Skill Fallback Title: Generic Task Queue with Package Metadata - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307105159.009_20260307_105234 Paper: pytrain.20260307105159.009	Self-Contained ZipApp Generator with Type Safety This benchmark tests the ability to programmatically generate a strictly-typed Python package structure, compile it into a executable Zip Application (`.pyz`) using the standard library, and verify its execution integrity. Requirements - Py...	03-29 08:01	Success	-	View
exp_pytrain.20260307105923.010_20260307_105948 Paper: pytrain.20260307105923.010	Dynamic Plugin Loader with Strict Protocol Validation README.md Dynamic Plugin Loader with Strict Protocol Validation Overview This benchmark demonstrates a robust, zero-dependency plugin architecture in Python. It utilizes `importlib` for dynamic discovery and loading of modules from a tempor...	03-29 08:01	Success	-	View
exp_pytrain.20260307110558.011_20260307_110624 Paper: pytrain.20260307110558.011	Type-Safe Extensible Log Formatter This coding drill evaluates a system's ability to design a robust, extensible logging architecture using Python's advanced type hinting system. The focus is on defining structural interfaces (`Protocol`), creating generic containers for dyn...	03-29 08:01	Success	-	View
exp_pytrain.20260307112731.012_20260307_112753 Paper: pytrain.20260307112731.012	Auto-Registry System Benchmark README.md Auto-Registry System Benchmark This benchmark evaluates the implementation of a robust, dynamic class registry system using Python's standard library. It simulates a modular plugin architecture, similar to those found in Hugging F...	03-29 08:01	Success	-	View
exp_pytrain.20260307113452.013_20260307_113530 Paper: pytrain.20260307113452.013	Generic Registry with Dynamic Module Discovery README.md Generic Registry with Dynamic Module Discovery This benchmark demonstrates a decoupled plugin architecture using Python's standard library. Design Philosophy Modern frameworks require extensibility without modifying core logic. Th...	03-29 08:01	Success	-	View
exp_pytrain.20260307124109.014_20260307_124125 Paper: pytrain.20260307124109.014	```markdown bash python benchmark.py	03-29 08:01	Success	-	View
exp_pytrain.20260307124707.015_20260307_124728 Paper: pytrain.20260307124707.015	```markdown bash python3.12 benchmark.py	03-29 08:01	Success	-	View
exp_pytrain.20260307153616.001_20260307_153705 Paper: pytrain.20260307153616.001	This benchmark evaluates a data transformation pipeline design that leverages Python's `typing.Protocol`, Generics (`Typ... README.md This benchmark evaluates a data transformation pipeline design that leverages Python's `typing.Protocol`, Generics (`TypeVar`), and `typing` module features to enforce structural typing and type safety. Design Principles 1. **Prot...	03-29 08:01	Success	-	View
exp_pytrain.20260307154009.001_20260307_154041 Paper: pytrain.20260307154009.001	Python Skill Fallback Title: Strictly-Typed Plugin Registry with PEP 562 Lazy Loading - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307154643.002_20260307_154718 Paper: pytrain.20260307154643.002	PEP 695 Dynamic Package Benchmark README.md PEP 695 Dynamic Package Benchmark This benchmark evaluates an autonomous coding system's ability to generate and verify modern Python typing constructs (PEP 695) within a dynamic file structure. Objective The script programmatical...	03-29 08:01	Success	-	View
exp_pytrain.20260307155254.003_20260307_155324 Paper: pytrain.20260307155254.003	Python Reliability Drill: Typing & Packaging Benchmark README.md Python Reliability Drill: Typing & Packaging Benchmark This benchmark evaluates a candidate's ability to implement robust utilities focusing on static analysis, type hint validation, and package structure verification using only t...	03-29 08:01	Success	-	View
exp_pytrain.20260307160444.004_20260307_160531 Paper: pytrain.20260307160444.004	```markdown README.md bash python benchmark.py RUNNING SELF-TESTS... [OK] ... BENCHMARKING... VRAM_USAGE: <value>MB TOKENS_PER_SEC: <value> VERIFIED: ... ---	03-29 08:01	Success	-	View
exp_pytrain.20260307161054.005_20260307_161123 Paper: pytrain.20260307161054.005	Runtime Package Composition with Generic Protocols This benchmark evaluates your ability to programmatically construct a Python package hierarchy using standard library modules like `types` and `importlib`, while enforcing strict type safety using `typing.Protocol` and `typing.Generic`. Obj...	03-29 08:01	Success	-	View
exp_pytrain.20260307161841.006_20260307_161914 Paper: pytrain.20260307161841.006	Dynamic Module Loader with Protocol Enforcement README.md Dynamic Module Loader with Protocol Enforcement Objective This benchmark tests the ability to dynamically construct a local package structure at runtime, discover modules using `importlib`, and rigorously enforce interface complia...	03-29 08:01	Success	-	View
exp_pytrain.20260307163924.007_20260307_164000 Paper: pytrain.20260307163924.007	Typed Component Registry System Benchmark README.md Typed Component Registry System Benchmark Overview This benchmark demonstrates a scalable Python package structure using structural subtyping (`typing.Protocol`) and a registration-based architecture. It simulates a scenar...	03-29 08:01	Success	-	View
exp_pytrain.20260307164606.008_20260307_164638 Paper: pytrain.20260307164606.008	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307165300.009_20260307_165324 Paper: pytrain.20260307165300.009	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307165852.010_20260307_165921 Paper: pytrain.20260307165852.010	Benchmark: Concurrent ZipApp Packager README.md Benchmark: Concurrent ZipApp Packager Overview This benchmark evaluates a Python engineer's ability to construct a robust, standalone CLI packaging tool. The core task involves building `packager.py`, which demonstrates concurrent...	03-29 08:01	Success	-	View
exp_pytrain.20260307170449.011_20260307_170515 Paper: pytrain.20260307170449.011	```markdown bash python benchmark.py text VRAM_USAGE: 0MB TOKENS_PER_SEC: <calculated_speed> VERIFIED: All plugins loaded and structural typing checks passed.	03-29 08:01	Success	-	View
exp_pytrain.20260307171105.012_20260307_171140 Paper: pytrain.20260307171105.012	Benchmark: Typed CLI Tool for Hyperparameter Validation README.md Benchmark: Typed CLI Tool for Hyperparameter Validation Objective This benchmark evaluates the robustness and efficiency of a Python-based CLI tool designed to validate machine learning training configurations. The implementation...	03-29 08:01	Success	-	View
exp_pytrain.20260307171803.013_20260307_171829 Paper: pytrain.20260307171803.013	Type-Safe Plugin Registry with Semantic Version Resolution README.md This benchmark evaluates a Python system's capability to manage a type-safe plugin architecture using only the standard library. Overview The system implements a `Plugin` Protocol and a central `Registry`. It demonstrates: 1. **Dy...	03-29 08:01	Success	-	View
exp_pytrain.20260307172409.014_20260307_172438 Paper: pytrain.20260307172409.014	Dynamic Component Registry with Runtime Type Validation README.md Dynamic Component Registry with Runtime Type Validation This coding drill benchmarks your ability to design a robust, plugin-based architecture in Python using only the standard library. Objective You must construct a single execu...	03-29 08:01	Success	-	View
exp_pytrain.20260307173023.015_20260307_173142 Paper: pytrain.20260307173023.015	Python Reliability Drill: Typing & Generics README.md Python Reliability Drill: Typing & Generics This benchmark evaluates a Python engineer's ability to implement robust, type-safe utilities using the standard library. Overview The drill implements a `TypedStore` utility leveraging...	03-29 08:01	Success	-	View
exp_pytrain.20260307173722.016_20260307_173759 Paper: pytrain.20260307173722.016	Python Skill Fallback Title: Runtime Type-Safe Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307174438.017_20260307_174505 Paper: pytrain.20260307174438.017	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307175102.018_20260307_175132 Paper: pytrain.20260307175102.018	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307180633.019_20260307_180705 Paper: pytrain.20260307180633.019	Python Skill Fallback Title: Generic Model Factory with Type-Safe Configuration - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307181235.020_20260307_181305 Paper: pytrain.20260307181235.020	Benchmark: Dynamic Module Loader with Structural Type Verification README.md Benchmark: Dynamic Module Loader with Structural Type Verification Objective This benchmark evaluates the robustness of a dynamic plugin loading system in Python. It simulates a high-performance environment (similar to LLM kernel...	03-29 08:01	Success	-	View
exp_pytrain.20260307181930.021_20260307_182014 Paper: pytrain.20260307181930.021	Python Skill Fallback Title: Robust Plugin Loader with Protocol Enforcement - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307182621.022_20260307_182648 Paper: pytrain.20260307182621.022	Generic Result Monad with PEP 695 This drill implements a robust `Result[T, E]` Monad (Generic Wrapper) using Python 3.12+ features. Features * PEP 695 Type Parameters: Uses the new syntax `class Result[T, E]:` instead of `typing.Generic`. * Module Structure: Explic...	03-29 08:01	Success	-	View
exp_pytrain.20260307184150.023_20260307_184226 Paper: pytrain.20260307184150.023	Python Reliability Drill: Runtime Typing & Validation README.md Python Reliability Drill: Runtime Typing & Validation Objective This benchmark evaluates the robustness and reliability of a Python utility designed to perform runtime type validation using the standard `typing` module. The goal i...	03-29 08:01	Success	-	View
exp_pytrain.20260307184859.024_20260307_184938 Paper: pytrain.20260307184859.024	Benchmark: Strictly-Typed CLI Data Exporter README.md Benchmark: Strictly-Typed CLI Data Exporter This benchmark evaluates a Python implementation that adheres to strict static typing using `typing.TypeVar`, `typing.Generic`, and `typing.Protocol`. It verifies the robustness of a dat...	03-29 08:01	Success	-	View
exp_pytrain.20260307190316.025_20260307_190338 Paper: pytrain.20260307190316.025	Typing-First Configuration Module Benchmark This benchmark evaluates the creation of a robust, strictly typed configuration management system using only Python's standard library. Overview The goal is to implement a `ConfigLoader` that enforces schema validation using `typing.TypedDi...	03-29 08:01	Success	-	View
exp_pytrain.20260307190938.026_20260307_191002 Paper: pytrain.20260307190938.026	Typed Component Registry and Config Validator README.md Typed Component Registry and Config Validator Hypothesis: A generic registry pattern combined with runtime type introspection (using `typing` and `inspect`) can create a robust, self-validating factory system, reducing runtime...	03-29 08:01	Success	-	View
exp_pytrain.20260307191536.027_20260307_191607 Paper: pytrain.20260307191536.027	Strictly Typed Tensor Core with Module Encapsulation README.md Strictly Typed Tensor Core with Module Encapsulation Overview This coding drill benchmarks the implementation of a robust, strictly typed `Tensor` data structure using only the Python Standard Library. It demonstrates advanced typ...	03-29 08:01	Success	-	View
exp_pytrain.20260307192143.028_20260307_192225 Paper: pytrain.20260307192143.028	Python Skill Fallback Title: Strict Generic Box Package - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307192804.029_20260307_192822 Paper: pytrain.20260307192804.029	Self-Validating Package Scaffold Generator Benchmark README.md Self-Validating Package Scaffold Generator Benchmark This benchmark evaluates a Python script's ability to programmatically generate a standards-compliant Python package structure ("src-layout") based on a strict `TypedDict` confi...	03-29 08:01	Success	-	View
exp_pytrain.20260307193451.030_20260307_193515 Paper: pytrain.20260307193451.030	Strict Runtime Interface Validator README.md Strict Runtime Interface Validator Overview This coding drill benchmarks your ability to construct a robust Python module loader that guarantees strict adherence to a defined interface at runtime. It leverages `importlib` for dyna...	03-29 08:01	Success	-	View
exp_pytrain.20260307194040.031_20260307_194119 Paper: pytrain.20260307194040.031	Strictly-Typed Modular Log Aggregator README.md Strictly-Typed Modular Log Aggregator Design Hypothesis This benchmark tests the hypothesis that enforcing strict type annotations (TypedDict, Protocols) and separating CLI logic from core business logic within a single artifact i...	03-29 08:01	Success	-	View
exp_pytrain.20260307194710.032_20260307_194736 Paper: pytrain.20260307194710.032	Python Skill Fallback Title: Type-Safe Configuration Registry for Multi-Modal Models - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307195308.033_20260307_195347 Paper: pytrain.20260307195308.033	Strictly-Typed Kernel Loader Registry This repository contains a single-file Python benchmark designed to simulate a high-performance kernel loading system similar to those found in vLLM or PyTorch. Overview In systems requiring high throughput, computational kernels (e.g., mat...	03-29 08:01	Success	-	View
exp_pytrain.20260307195915.034_20260307_195945 Paper: pytrain.20260307195915.034	Dynamic Type-Safe Plugin Loader Benchmark README.md Dynamic Type-Safe Plugin Loader Benchmark Hypothesis An autonomous coding system can construct a robust, dependency-free plugin architecture using Python's standard library. By leveraging `typing.Protocol` for structural subtyping...	03-29 08:01	Success	-	View
exp_pytrain.20260307200553.035_20260307_200627 Paper: pytrain.20260307200553.035	Generic Data Processing Framework Benchmark README.md Generic Data Processing Framework Benchmark This benchmark evaluates a robust data processing pipeline implementation utilizing modern Python typing features introduced in PEP 695 (Type Parameter Syntax) and PEP 484 (Protocols). D...	03-29 08:01	Success	-	View
exp_pytrain.20260307201300.036_20260307_201329 Paper: pytrain.20260307201300.036	Typed Async Service Package Benchmark README.md Typed Async Service Package Benchmark Objective This benchmark evaluates a single-file Python script designed to function as a lightweight, installable-style package. The focus is on strict typing adherence, proper `asyncio` usage...	03-29 08:01	Success	-	View
exp_pytrain.20260307201955.001_20260307_202020 Paper: pytrain.20260307201955.001	Strictly Typed Dynamic Module Loader README.md Strictly Typed Dynamic Module Loader Objective: This benchmark tests the reliability and performance of a Python-based plugin loading system that leverages advanced `typing` features (Protocols and Generics) to enforce runtime...	03-29 08:01	Success	-	View
exp_pytrain.20260307202626.002_20260307_202654 Paper: pytrain.20260307202626.002	PEP 695 Generic Repository & Module Encapsulation Benchmark README.md PEP 695 Generic Repository & Module Encapsulation Benchmark This benchmark validates the implementation of a generic repository system using Python 3.12+ Type Parameter Syntax (PEP 695) and strict Module Encapsulation (`__...	03-29 08:01	Success	-	View
exp_pytrain.20260307203324.003_20260307_203350 Paper: pytrain.20260307203324.003	Type-Safe Dependency Resolver Engine This benchmark is designed to test a Python engineering system's ability to implement a robust, type-safe algorithm using only the standard library. Objective Create a dependency resolution engine that calculates the correct installation or...	03-29 08:01	Success	-	View
exp_pytrain.20260307204007.004_20260307_204033 Paper: pytrain.20260307204007.004	Benchmark: Strictly-Typed Generic Pipeline README.md Benchmark: Strictly-Typed Generic Pipeline Overview This benchmark implements a robust, single-file `DataPipeline` using Python's advanced static typing features. It demonstrates how Generics, Protocols, and Type Guards can be use...	03-29 08:01	Success	-	View
exp_pytrain.20260307205246.005_20260307_205322 Paper: pytrain.20260307205246.005	Strictly Typed Plugin Architecture Benchmark README.md Strictly Typed Plugin Architecture Benchmark This benchmark evaluates the design of a robust, extensible command registry within a single file, leveraging Python's `typing` module for strict interface enforcement and simulation of...	03-29 08:01	Success	-	View
exp_pytrain.20260307205852.006_20260307_205923 Paper: pytrain.20260307205852.006	Project: Dynamic Extension Loader with Protocol Verification Benchmark README.md Project: Dynamic Extension Loader with Protocol Verification Benchmark Description: This benchmark demonstrates a zero-dependency plugin architecture using Python's standard library. It programmatically generates a tempora...	03-29 08:01	Success	-	View
exp_pytrain.20260307210514.007_20260307_210544 Paper: pytrain.20260307210514.007	Python Skill Fallback Title: Typing-Driven Dynamic Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307212344.008_20260307_212415 Paper: pytrain.20260307212344.008	Generic Asset Loader Benchmark This benchmark tests the creation of a robust, reusable generic asset loader using Python 3.12's new Type Parameter Syntax (PEP 695) and the modern `importlib.resources` API for packaging. Objectives 1. PEP 695 Implementation: Define cl...	03-29 08:01	Success	-	View
exp_pytrain.20260307213028.009_20260307_213107 Paper: pytrain.20260307213028.009	Title: Type-Safe Dynamic Module Loader Benchmark README.md Title: Type-Safe Dynamic Module Loader Benchmark Objective: Validate a dynamic module loading strategy using `typing.Protocol` for structural subtyping (duck typing) verification at runtime. Description: This benchmark...	03-29 08:01	Success	-	View
exp_pytrain.20260307213734.010_20260307_213807 Paper: pytrain.20260307213734.010	Python Skill Fallback Title: Typed Async Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307214405.011_20260307_214439 Paper: pytrain.20260307214405.011	Dynamic Virtual Package Loader with Generic Protocol Enforcement This benchmark demonstrates an advanced Python pattern involving the dynamic construction of Python modules in-memory without touching the filesystem, combined with structural subtyping (Protocol) enforcement. This mirrors how modern plugin...	03-29 08:01	Success	-	View
exp_pytrain.20260307215057.012_20260307_215121 Paper: pytrain.20260307215057.012	Robust Dynamic Plugin Registry Benchmark README.md Robust Dynamic Plugin Registry Benchmark This benchmark tests the hypothesis that an autonomous system can construct a robust, type-safe plugin architecture using Python's standard library. It mirrors the dynamic model loading mec...	03-29 08:01	Success	-	View
exp_pytrain.20260307215718.013_20260307_215756 Paper: pytrain.20260307215718.013	Strict Protocol Enforcement and Virtual Package Management Benchmark README.md Strict Protocol Enforcement and Virtual Package Management Benchmark Design Brief This benchmark simulates the internal architecture of robust AI libraries like vLLM or PyTorch. It focuses on the problem of dynamic backend...	03-29 08:01	Success	-	View
exp_pytrain.20260307221100.014_20260307_221137 Paper: pytrain.20260307221100.014	Benchmark: Strictly-Typed Recipe Executor with Metadata Validation README.md Benchmark: Strictly-Typed Recipe Executor with Metadata Validation This benchmark tests the ability to write robust, production-grade Python code that enforces strict typing using modern type hinting features (`Protocol`, `Generic...	03-29 08:01	Success	-	View
exp_pytrain.20260307221746.015_20260307_221819 Paper: pytrain.20260307221746.015	Python Skill Fallback Title: Type-Safe Generic Resource Pool with Modern Packaging Hygiene - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307222338.016_20260307_222410 Paper: pytrain.20260307222338.016	Coding Drill: Strict Typed Data Ingestion Module README.md Coding Drill: Strict Typed Data Ingestion Module Objective This benchmark evaluates the candidate's ability to construct a robust, production-ready Python data processing module using strictly the Standard Library. The focus is on...	03-29 08:01	Success	-	View
exp_pytrain.20260307223031.017_20260307_223107 Paper: pytrain.20260307223031.017	Type-Safe Dynamic Plugin System A Python benchmark demonstrating advanced packaging and typing capabilities by implementing a dynamic discovery system. The system loads code from a virtual package structure at runtime, enforcing strict interface compliance using `typing.P...	03-29 08:01	Success	-	View
exp_pytrain.20260307223640.018_20260307_223705 Paper: pytrain.20260307223640.018	Dynamic Package Loader with Runtime Type Enforcement README.md Title: Dynamic Package Loader with Runtime Type Enforcement Objective This benchmark tests a Python engineer's ability to programmatically manipulate the Python import system, construct valid in-memory package structures, and enfo...	03-29 08:01	Success	-	View
exp_pytrain.20260307224307.019_20260307_224332 Paper: pytrain.20260307224307.019	Robust Dynamic Plugin Loader with Protocol Validation README.md Robust Dynamic Plugin Loader with Protocol Validation Overview This benchmark demonstrates the construction of a modular, extensible application architecture using Python's standard library. It simulates a plugin system where modu...	03-29 08:01	Success	-	View
exp_pytrain.20260307225831.020_20260307_225901 Paper: pytrain.20260307225831.020	Generic Component Registry Benchmark README.md Generic Component Registry Benchmark Overview This benchmark tests the ability of an autonomous coding agent to construct a sophisticated, type-safe plugin system using only the Python standard library. Core Concepts The system ut...	03-29 08:01	Success	-	View
exp_pytrain.20260307230500.021_20260307_230535 Paper: pytrain.20260307230500.021	Benchmark: Robust Dynamic Module Loader with TypeGuard Validation README.md Benchmark: Robust Dynamic Module Loader with TypeGuard Validation Overview This benchmark tests a Python engine's ability to programmatically generate a file-system package structure, dynamically import it using `importlib`, a...	03-29 08:01	Success	-	View
exp_pytrain.20260307231227.022_20260307_231251 Paper: pytrain.20260307231227.022	--- README.md --- Modern Generic Distribution Inspector Hypothesis Adopting PEP 695 Type Parameter Syntax simplifies the definition of generic container classes and type aliases, reducing the boilerplate and cognitive load associated with legac...	03-29 08:01	Success	-	View
exp_pytrain.20260307231904.023_20260307_231938 Paper: pytrain.20260307231904.023	Strictly-Typed Async Worker Module Benchmark README.md Strictly-Typed Async Worker Module Benchmark This benchmark evaluates a Python system's ability to structure a professional, single-file software package. It specifically targets strict type usage (Generics), public API definition...	03-29 08:01	Success	-	View
exp_pytrain.20260307232519.024_20260307_232550 Paper: pytrain.20260307232519.024	Strict Package Metadata Validator README.md Strict Package Metadata Validator Overview This coding drill benchmark tests an autonomous coding system's ability to utilize Python's static typing system, specifically `TypedDict` and strict type checking protocols. The script i...	03-29 08:01	Success	-	View
exp_pytrain.20260307233124.025_20260307_233153 Paper: pytrain.20260307233124.025	Strict Package API Validator Benchmark README.md Strict Package API Validator Benchmark Overview This coding drill benchmarks a robust, dependency-free implementation of a Package API Validator. The goal is to enforce packaging hygiene and type safety at runtime by validatin...	03-29 08:01	Success	-	View
exp_pytrain.20260307234613.026_20260307_234649 Paper: pytrain.20260307234613.026	Python Skill Fallback Title: Dynamic Component Loader with Strict Typing and Dependency Validation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260307235226.027_20260307_235254 Paper: pytrain.20260307235226.027	Generic Training Pipeline with Runtime Protocol Validation README.md Generic Training Pipeline with Runtime Protocol Validation This benchmark evaluates the implementation of a strictly typed, mock machine learning training pipeline using Python's standard library advanced typing features. Objectiv...	03-29 08:01	Success	-	View
exp_pytrain.20260307235846.028_20260307_235921 Paper: pytrain.20260307235846.028	Python Skill Fallback Title: Strictly-Typed Application Configuration Manager - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308000500.029_20260308_000533 Paper: pytrain.20260308000500.029	Python Skill Fallback Title: Runtime Plugin Discovery with Strict Protocol Validation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308001106.030_20260308_001128 Paper: pytrain.20260308001106.030	Strict Typing and Module Structure for Async Handlers Overview This benchmark evaluates your ability to construct a robust, distributable Python library module (`handler_lib.py`) that adheres to strict type-checking protocols and packaging conventions. Objectives 1. Module Structure: Prope...	03-29 08:01	Success	-	View
exp_pytrain.20260308001745.031_20260308_001808 Paper: pytrain.20260308001745.031	Python Skill Fallback Title: Type-Safe Backend Dispatcher with Namespace Isolation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308002412.032_20260308_002447 Paper: pytrain.20260308002412.032	Strictly Typed Dynamic Configuration Dispatcher This benchmark simulates the core of a lightweight ML framework where model components are instantiated dynamically based on type-safe configurations. It relies on Python's `typing.Protocol` for interface definition and `typing.get_type_hin...	03-29 08:01	Success	-	View
exp_pytrain.20260308003042.033_20260308_003114 Paper: pytrain.20260308003042.033	Typed Configuration Schema and Runtime Dependency Validator README.md Typed Configuration Schema and Runtime Dependency Validator Objective This benchmark tests the ability to design a robust, type-safe configuration management module using standard Python libraries. It simulates the initialization...	03-29 08:01	Success	-	View
exp_pytrain.20260308003708.034_20260308_003745 Paper: pytrain.20260308003708.034	Strictly-Typed Dynamic Plugin Loader README.md Strictly-Typed Dynamic Plugin Loader Objective This benchmark evaluates the ability to write a robust, modular Python system using advanced type hinting features (`typing.Protocol`, `typing.TypeVar`) and reflection tools (`importl...	03-29 08:01	Success	-	View
exp_pytrain.20260308004344.035_20260308_004415 Paper: pytrain.20260308004344.035	Type-Safe Dynamic Plugin Loader Benchmark README.md Type-Safe Dynamic Plugin Loader Benchmark Objective This benchmark evaluates a Python 3.12+ implementation of a dynamic plugin system that enforces structural type safety at runtime without external dependencies. Technical Context...	03-29 08:01	Success	-	View
exp_pytrain.20260308005017.036_20260308_005056 Paper: pytrain.20260308005017.036	Section 1: README.md Strict Type-Safe Package Scaffolder This benchmark evaluates your ability to design robust, type-safe Python filesystem tooling using modern standard library features (`dataclasses`, `Protocol`, `pathlib`). Objective Create a CLI tool that...	03-29 08:01	Success	-	View
exp_pytrain.20260308005706.037_20260308_005731 Paper: pytrain.20260308005706.037	Python Skill Fallback Title: Metadata-Aware Typed Dispatcher - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308010332.038_20260308_010405 Paper: pytrain.20260308010332.038	Strictly-Typed Dynamic Package Loader and Validator README.md Strictly-Typed Dynamic Package Loader and Validator Overview This benchmark evaluates a Python system's capability to dynamically generate Python packages in a temporary filesystem, load them using `importlib`, and enforce strict...	03-29 08:01	Success	-	View
exp_pytrain.20260308011000.039_20260308_011032 Paper: pytrain.20260308011000.039	--- README.md StrictlyTypedAutoRegistry Benchmark Overview This benchmark implements a strictly-typed, plugin-based model registry system similar to the architecture found in Hugging Face Transformers or Diffusers, utilizing only the Py...	03-29 08:01	Success	-	View
exp_pytrain.20260308011620.040_20260308_011643 Paper: pytrain.20260308011620.040	Title: Dynamic Plugin Registry with Type-Safe Dispatch README.md Title: Dynamic Plugin Registry with Type-Safe Dispatch Description: This benchmark evaluates an autonomous coding agent's ability to construct a robust, extensible plugin architecture using the Python standard library. The...	03-29 08:01	Success	-	View
exp_pytrain.20260308012254.041_20260308_012315 Paper: pytrain.20260308012254.041	Benchmark: Runtime Package Construction with Generic Protocol Enforcement README.md Benchmark: Runtime Package Construction with Generic Protocol Enforcement Overview This benchmark validates an autonomous system's ability to synthesize a valid Python package structure at runtime. It dynamically generates source...	03-29 08:01	Success	-	View
exp_pytrain.20260308012937.042_20260308_013007 Paper: pytrain.20260308012937.042	PEP 695 Generic Repository Implementation Benchmark README.md PEP 695 Generic Repository Implementation Benchmark This benchmark demonstrates the utilization of PEP 695 (Type Parameter Syntax) introduced in Python 3.12. It implements a thread-safe, generic in-memory `Repository` class us...	03-29 08:01	Success	-	View
exp_pytrain.20260308013606.043_20260308_013634 Paper: pytrain.20260308013606.043	Protocol-Based Dynamic Plugin Loader Overview This benchmark validates a robust, modular Python architecture that enables runtime extensibility without tight coupling. It utilizes `typing.Protocol` to define structural interfaces and `importlib` to dynamically load code from a...	03-29 08:01	Success	-	View
exp_pytrain.20260308015640.044_20260308_015700 Paper: pytrain.20260308015640.044	Strictly-Typed Plugin Registry with Runtime Validation README.md Strictly-Typed Plugin Registry with Runtime Validation Design Brief This benchmark validates a Python engineer's ability to construct a robust, extensible architecture using Python's advanced type system (Protocols, Generics) and...	03-29 08:01	Success	-	View
exp_pytrain.20260308030305.045_20260308_030334 Paper: pytrain.20260308030305.045	Robust Plugin Registry with Structural Subtyping README.md Robust Plugin Registry with Structural Subtyping Hypothesis Utilizing structural subtyping (`typing.Protocol`) for package interfaces decouples implementation details from definition. This facilitates independent development and t...	03-29 08:01	Success	-	View
exp_pytrain.20260308031001.046_20260308_031029 Paper: pytrain.20260308031001.046	Strictly-Typed Plugin Registry Benchmark README.md Strictly-Typed Plugin Registry Benchmark Overview This benchmark evaluates a robust `PluginRegistry` implementation designed for modular ML pipelines. It emphasizes strict type safety using Python's `typing.Protocol` and `typing.T...	03-29 08:01	Success	-	View
exp_pytrain.20260308031602.047_20260308_031628 Paper: pytrain.20260308031602.047	Dynamic Plugin Registry with Strict Structural Subtyping This benchmark evaluates a Python engine's capability to dynamically construct a modular architecture using runtime code generation and strict structural subtyping (Protocols). Overview In modern MLOps systems, pipelines are often composed...	03-29 08:01	Success	-	View
exp_pytrain.20260308032213.048_20260308_032246 Paper: pytrain.20260308032213.048	--- README.md --- Generic Dependency Resolver and Module Structure Simulation Overview This coding drill benchmark, `benchmark.py`, implements `mini_installer.py` as a self-contained, type-safe Python module. It simulates a minimal package mana...	03-29 08:01	Success	-	View
exp_pytrain.20260308032821.049_20260308_032848 Paper: pytrain.20260308032821.049	Strictly-Typed Python Package Scaffolder Overview This coding drill benchmarks the ability to construct a robust, file-system generator that strictly enforces data schemas before execution. The goal is to implement a standalone executable script (embedded within this benchmark) th...	03-29 08:01	Success	-	View
exp_pytrain.20260308033504.050_20260308_033541 Paper: pytrain.20260308033504.050	Type-Safe Dynamic Plugin Discovery System README.md Type-Safe Dynamic Plugin Discovery System This benchmark validates a Python system that simulates an autonomous package distribution and import workflow. It programmatically generates a Python package structure on the disk, enforc...	03-29 08:01	Success	-	View
exp_pytrain.20260308034241.051_20260308_034306 Paper: pytrain.20260308034241.051	Dynamic Module Loader and Strict Interface Verifier README.md Dynamic Module Loader and Strict Interface Verifier This benchmark evaluates the ability of a Python system to dynamically load code from a string source and rigorously validate its adherence to a `typing.Protocol` interface. Hypo...	03-29 08:01	Success	-	View
exp_pytrain.20260308035511.052_20260308_035538 Paper: pytrain.20260308035511.052	```markdown README.md bash python benchmark.py ---	03-29 08:01	Success	-	View
exp_pytrain.20260308040139.053_20260308_040211 Paper: pytrain.20260308040139.053	Dynamic Backend Loader with Type Protocol Validation This benchmark simulates a high-performance plugin architecture commonly found in systems like vLLM or PyTorch, where backends (CUDA, CPU, FlashAttention implementations) are loaded dynamically based on availability or user configuration. T...	03-29 08:01	Success	-	View
exp_pytrain.20260308040734.054_20260308_040800 Paper: pytrain.20260308040734.054	Python Skill Fallback Title: Generic Plugin Registry with Dynamic Module Loading - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308041409.055_20260308_041435 Paper: pytrain.20260308041409.055	Dynamic Package Construction and Type Introspection Benchmark README.md Dynamic Package Construction and Type Introspection Benchmark Overview This benchmark evaluates an autonomous coding system's ability to leverage Python 3.12+ features, specifically PEP 695 (Type Parameter Syntax). The system must...	03-29 08:01	Success	-	View
exp_pytrain.20260308042050.056_20260308_042120 Paper: pytrain.20260308042050.056	Strictly-Typed Dynamic Plugin Loader README.md Strictly-Typed Dynamic Plugin Loader This coding drill benchmarks the creation of a robust, dynamic extension system using Python's standard library. Context Traditional plugin architectures in Python often rely on loose conventio...	03-29 08:01	Success	-	View
exp_pytrain.20260308042711.057_20260308_042738 Paper: pytrain.20260308042711.057	Type-Safe Plugin Registry & Package Mock Benchmark This benchmark evaluates the ability of a system to construct a valid Python package structure using standard library typing features. The script simulates a distributable library `datatools` that defines a strict Protocol interface, discov...	03-29 08:01	Success	-	View
exp_pytrain.20260308043336.058_20260308_043358 Paper: pytrain.20260308043336.058	Type-Safe Python Package Scaffolder Benchmark README.md Type-Safe Python Package Scaffolder Benchmark Description This benchmark evaluates the generation of a robust, type-safe Python CLI tool that automates the creation of standard Python package structures. Goal The solution...	03-29 08:01	Success	-	View
exp_pytrain.20260308044009.059_20260308_044030 Paper: pytrain.20260308044009.059	Python Skill Fallback Title: Strictly-Typed Dynamic Component Registry - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308044620.060_20260308_044648 Paper: pytrain.20260308044620.060	Strictly-Typed Operation Registry & CLI README.md Strictly-Typed Operation Registry & CLI This repository contains a single-file Python package (`benchmark.py`) that demonstrates a robust, strictly-typed plugin architecture using Python's `typing.Protocol`, `TypeVar`, and `Generi...	03-29 08:01	Success	-	View
exp_pytrain.20260308045216.061_20260308_045247 Paper: pytrain.20260308045216.061	Python Skill Fallback Title: Strict Typing Runtime Validator - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308045836.062_20260308_045910 Paper: pytrain.20260308045836.062	Python Skill Fallback Title: PEP 695 Generic Service Registry - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308050541.063_20260308_050610 Paper: pytrain.20260308050541.063	```markdown bash python3 benchmark.py	03-29 08:01	Success	-	View
exp_pytrain.20260308051348.064_20260308_051419 Paper: pytrain.20260308051348.064	Structural Subtyping Validator for Dynamic Modules README.md Structural Subtyping Validator for Dynamic Modules Overview This benchmark tests the implementation of a robust, structural subtyping system using Python's `typing.Protocol`. Unlike nominal typing (inheritance), structural typing...	03-29 08:01	Success	-	View
exp_pytrain.20260308051953.065_20260308_052035 Paper: pytrain.20260308051953.065	Strictly-Typed Modular Plugin Dispatcher Benchmark README.md Strictly-Typed Modular Plugin Dispatcher Benchmark This benchmark evaluates a Python engineer's ability to construct a self-contained, strictly-typed plugin ecosystem using the standard library. Objectives 1. **Protocol Enforcemen...	03-29 08:01	Success	-	View
exp_pytrain.20260308052708.066_20260308_052740 Paper: pytrain.20260308052708.066	This benchmark focuses on the creation of a robust, strictly typed configuration module for a high-performance inference... README.md This benchmark focuses on the creation of a robust, strictly typed configuration module for a high-performance inference engine, similar to architectures found in vLLM or FlashAttention. Objective The goal is to demonstrate ho...	03-29 08:01	Success	-	View
exp_pytrain.20260308053312.067_20260308_053344 Paper: pytrain.20260308053312.067	--- README.md Benchmark: Robustly Typed Module Design Objective This benchmark evaluates your ability to design a robust, self-contained Python library that adheres to strict packaging and typing standards. It focuses on using Python's type sys...	03-29 08:01	Success	-	View
exp_pytrain.20260308053932.068_20260308_054005 Paper: pytrain.20260308053932.068	--- README.md --- Strictly-Typed Plugin Loader Benchmark Objective: Evaluate the performance and robustness of a dynamic plugin loading system that utilizes Python 3.12's PEP 695 Type Parameter Syntax, PEP 484 Type Hints, and `typing.Protoc...	03-29 08:01	Success	-	View
exp_pytrain.20260308055111.069_20260308_055140 Paper: pytrain.20260308055111.069	Python Skill Fallback Title: Strict Package Interface Verifier - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308055738.070_20260308_055812 Paper: pytrain.20260308055738.070	```markdown README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_pytrain.20260308060403.071_20260308_060432 Paper: pytrain.20260308060403.071	Type-Safe Python Package Scaffolder Benchmark README.md Type-Safe Python Package Scaffolder Benchmark This benchmark evaluates the implementation of a robust, type-safe CLI tool for generating Python package scaffolds. It emphasizes the use of modern Python typing constructs (`TypedDic...	03-29 08:01	Success	-	View
exp_pytrain.20260308061019.072_20260308_061042 Paper: pytrain.20260308061019.072	Python Skill Fallback Title: Type-Safe Generic Registry with Dynamic Dependency Simulation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308061627.073_20260308_061659 Paper: pytrain.20260308061627.073	Strictly-Typed Backend Dispatcher README.md Strictly-Typed Backend Dispatcher Design Brief This benchmark evaluates a Python system's ability to design a robust internal package structure that simulates a 'hardware dispatcher' (similar to `vllm` or `flash-attention` selecti...	03-29 08:01	Success	-	View
exp_pytrain.20260308062920.074_20260308_062959 Paper: pytrain.20260308062920.074	Section 1: README.md Strictly Typed Dependency Constraint Resolver Overview This benchmark tests a developer's ability to implement a core algorithm (dependency resolution) using Python's advanced type system features. The goal is to create a robust, subset-com...	03-29 08:01	Success	-	View
exp_pytrain.20260308063735.001_20260308_063809 Paper: pytrain.20260308063735.001	Virtual Package Construction with Generic Protocols Objective This benchmark evaluates a system's ability to programmatically synthesize a valid Python package structure on the filesystem while strictly adhering to PEP 484 typing standards (specifically Generics and Protocols). Design Brief...	03-29 08:01	Success	-	View
exp_pytrain.20260308064535.001_20260308_064611 Paper: pytrain.20260308064535.001	Structural Plugin Loader Benchmark README.md Structural Plugin Loader Benchmark Overview This benchmark evaluates a system's ability to implement a modular, type-safe plugin architecture using Python's standard library. It focuses on `typing.Protocol` for structural subtypin...	03-29 08:01	Success	-	View
exp_pytrain.20260308065154.002_20260308_065226 Paper: pytrain.20260308065154.002	pytrain.20260308065154.002 No summary available yet.	03-29 08:01	Success	-	View
exp_pytrain.20260308065800.003_20260308_065823 Paper: pytrain.20260308065800.003	Strict Typed Dynamic Plugin Loader This benchmark evaluates a Python script's ability to perform robust dynamic module loading and verification using Python's type system. Objective The script demonstrates how to safely load external code (plugins) at runtime. It leverages `...	03-29 08:01	Success	-	View
exp_pytrain.20260308070419.004_20260308_070502 Paper: pytrain.20260308070419.004	Strictly Typed Data Ingestion Module Benchmark README.md Strictly Typed Data Ingestion Module Benchmark Objective This benchmark evaluates the correctness and performance of a Python module (`ingestor.py`) designed with strict typing standards. The module utilizes `TypedDict`, `Protocol...	03-29 08:01	Success	-	View
exp_pytrain.20260308071045.005_20260308_071119 Paper: pytrain.20260308071045.005	```markdown README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_pytrain.20260308071811.006_20260308_071845 Paper: pytrain.20260308071811.006	Python Skill Fallback Title: Strictly Typed Component Registry for Simulation Engine - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308072454.007_20260308_072523 Paper: pytrain.20260308072454.007	Dynamic Generic Package Builder This benchmark tests the ability of a system to programmatically scaffold a valid Python package structure, handle relative imports, and verify runtime behavior of Generic types. Instructions 1. Save the code below into a file named `benchm...	03-29 08:01	Success	-	View
exp_pytrain.20260308073130.008_20260308_073213 Paper: pytrain.20260308073130.008	```markdown bash python benchmark.py ``` Expected Output The script will generate temporary files, load plugins, process data, print performance metrics, and conclude with a `VERIFIED` status.	03-29 08:01	Success	-	View
exp_pytrain.20260308073819.009_20260308_073843 Paper: pytrain.20260308073819.009	--- README.md Dynamic Type-Verified Plugin System Benchmark Overview This benchmark tests the hypothesis that structural subtyping (using `typing.Protocol`) combined with dynamic module loading (using `importlib`) allows for the creatio...	03-29 08:01	Success	-	View
exp_pytrain.20260308075149.010_20260308_075217 Paper: pytrain.20260308075149.010	Strictly Typed Asynchronous Plugin Loader README.md Strictly Typed Asynchronous Plugin Loader Overview This coding drill evaluates a Python system's ability to simulate a distributable package structure while enforcing strict type safety using `typing.Protocol` and `typing.Generic`...	03-29 08:01	Success	-	View
exp_pytrain.20260308075803.011_20260308_075841 Paper: pytrain.20260308075803.011	Modular Log Analysis Toolkit Benchmark README.md Modular Log Analysis Toolkit Benchmark Overview This coding drill evaluates the ability to construct a robust, single-file Python executable that mimics a professional package structure. The solution implements a text processing t...	03-29 08:01	Success	-	View
exp_pytrain.20260308080411.012_20260308_080440 Paper: pytrain.20260308080411.012	--- README.md Typed Component Registry Benchmark Overview This benchmark tests the ability to design a robust, modular, and type-safe component registry system using Python's `typing` module. It simulates the architecture found in large-sca...	03-29 08:01	Success	-	View
exp_pytrain.20260308081041.013_20260308_081109 Paper: pytrain.20260308081041.013	Generic Plugin Registry Benchmark Overview This benchmark demonstrates a high-performance, type-safe plugin architecture suitable for large-scale Python applications (such as inference engines or data pipelines). It leverages Python's `typing.Protocol` for structural subtyp...	03-29 08:01	Success	-	View
exp_pytrain.20260308081702.014_20260308_081727 Paper: pytrain.20260308081702.014	Python Skill Fallback Title: Dynamic Plugin Loader with Structural Subtyping - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308082333.015_20260308_082413 Paper: pytrain.20260308082333.015	Strict Typed Package Scaffolder README.md Strict Typed Package Scaffolder Overview This benchmark tests the ability of a coding system to leverage modern Python 3.12+ features, specifically PEP 695 (Type Parameter Syntax) and strict typing protocols, to construct a ro...	03-29 08:01	Success	-	View
exp_pytrain.20260308083006.016_20260308_083026 Paper: pytrain.20260308083006.016	Python Skill Fallback Title: Type-Safe Dynamic Module Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308083613.017_20260308_083639 Paper: pytrain.20260308083613.017	pytrain.20260308083613.017 No summary available yet.	03-29 08:01	Success	-	View
exp_pytrain.20260308084218.018_20260308_084250 Paper: pytrain.20260308084218.018	Type-Safe Plugin Loader with Runtime Validation README.md Type-Safe Plugin Loader with Runtime Validation Overview This coding drill benchmark tests the ability to design a robust, extensible module loader using Python's `typing.Protocol` and `@runtime_checkable` decorators. The goal is...	03-29 08:01	Success	-	View
exp_pytrain.20260308084928.019_20260308_085019 Paper: pytrain.20260308084928.019	Lazy Backend Loader - Coding Drill Benchmark This document outlines a coding drill designed to test knowledge of Python's `typing.Protocol`, `importlib`, and exception handling within the context of building a lazy-loading system for heavy machine-learning backends (simulating framewo...	03-29 08:01	Success	-	View
exp_pytrain.20260308090032.020_20260308_090100 Paper: pytrain.20260308090032.020	Dynamic Configuration Loader with Strict Typing and Virtual Packaging README.md Dynamic Configuration Loader with Strict Typing and Virtual Packaging This benchmark validates the design of a scalable, PyTorch-like experiment framework skeleton. It tests the core engineering skills required to build large-scal...	03-29 08:01	Success	-	View
exp_pytrain.20260308090705.021_20260308_090735 Paper: pytrain.20260308090705.021	Python Reliability Drill: Typing & Packaging README.md Python Reliability Drill: Typing & Packaging This benchmark suite, `benchmark.py`, is designed to validate robustness in Python type handling and module packaging structures without external dependencies. It simulates a high-perfo...	03-29 08:01	Success	-	View
exp_pytrain.20260308091342.022_20260308_091420 Paper: pytrain.20260308091342.022	Benchmark: PEP 695 Generic Registry and ZipApp Deployment README.md Benchmark: PEP 695 Generic Registry and ZipApp Deployment Objective This benchmark validates the developer's ability to utilize PEP 695 Type Parameter Syntax to define robust, thread-safe generic classes and package them as a...	03-29 08:01	Success	-	View
exp_pytrain.20260308092015.023_20260308_092052 Paper: pytrain.20260308092015.023	Strictly Typed Dynamic Module Inspector README.md Strictly Typed Dynamic Module Inspector This Python coding drill demonstrates the creation of a robust utility that leverages the `typing.Protocol` for structural subtyping and `importlib` for runtime introspection. Hypothesis An...	03-29 08:01	Success	-	View
exp_pytrain.20260308092659.024_20260308_092740 Paper: pytrain.20260308092659.024	Here is the design for the coding drill benchmark focusing on a Robust Dynamic Plugin Loader with Runtime Type Verificat... README.md Dynamic Plugin Loader & Runtime Type Verification Benchmark Overview This benchmark demonstrates the creation of a robust, modular Python system that dynamically loads code at runtime. It leverages Python's `importlib` for runtime...	03-29 08:01	Success	-	View
exp_pytrain.20260308094751.025_20260308_094818 Paper: pytrain.20260308094751.025	Dynamic Module Loader with Protocol Validation README.md Dynamic Module Loader with Protocol Validation Overview This benchmark tests the ability to construct a robust, type-safe dynamic plugin system using Python's standard library. The solution demonstrates advanced `typing.Protocol`...	03-29 08:01	Success	-	View
exp_pytrain.20260308095422.026_20260308_095453 Paper: pytrain.20260308095422.026	Strict Typed Artifact Packager Benchmark README.md Strict Typed Artifact Packager Benchmark Overview This benchmark evaluates the engineer's ability to construct robust deployment pipelines using Python's `typing` module and file-system management utilities. Hypothesis: Robust...	03-29 08:01	Success	-	View
exp_pytrain.20260308100025.027_20260308_100056 Paper: pytrain.20260308100025.027	Python Engineering Drill: Dynamic Component Registry README.md Python Engineering Drill: Dynamic Component Registry Objective This benchmark tests the ability to implement a robust, type-safe plugin system using only the Python Standard Library. Focus Areas: 1. Advanced Typing: Correc...	03-29 08:01	Success	-	View
exp_pytrain.20260308100637.028_20260308_100710 Paper: pytrain.20260308100637.028	Python Skill Fallback Title: Generic Package Registry with PEP 695 - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308101322.029_20260308_101408 Paper: pytrain.20260308101322.029	Strictly Typed Plugin Architecture Simulation README.md Strictly Typed Plugin Architecture Simulation Hypothesis An autonomous system can effectively internalize modern Python typing and packaging concepts by constructing a lightweight, extensible plugin system using only the standard...	03-29 08:01	Success	-	View
exp_pytrain.20260308102615.030_20260308_102643 Paper: pytrain.20260308102615.030	Asynchronous Log Aggregator with Strict Typing Overview This benchmark evaluates the effectiveness of combining Python's `asyncio` library with strict static typing (`typing.TypedDict`, `dataclasses`) for building a simulated high-throughput log processing pipeline. The hypothesis is th...	03-29 08:01	Success	-	View
exp_pytrain.20260308103242.031_20260308_103308 Paper: pytrain.20260308103242.031	```markdown Dynamic Module Loader with Strict Protocol Validation Overview This coding drill tests the ability to design a robust plugin system using Python's standard library. The focus is on dynamic code discovery/loading using `importlib` and enforc...	03-29 08:01	Success	-	View
exp_pytrain.20260308103942.032_20260308_104016 Paper: pytrain.20260308103942.032	Strict-Typed Component Factory Benchmark README.md Strict-Typed Component Factory Benchmark This benchmark validates a candidate's ability to structure a Python module that simulates a professional package architecture. It focuses on strict typing using `typing.Protocol`, proper e...	03-29 08:01	Success	-	View
exp_pytrain.20260308104634.033_20260308_104701 Paper: pytrain.20260308104634.033	Dynamic Module Discovery with Structural Subtyping Benchmark README.md Dynamic Module Discovery with Structural Subtyping Benchmark Overview This benchmark tests a robust plugin architecture hypothesis: using `typing.Protocol` with `runtime_checkable` provides a more flexible and decoupled method for...	03-29 08:01	Success	-	View
exp_pytrain.20260308105258.034_20260308_105331 Paper: pytrain.20260308105258.034	Python Skill Fallback Title: Dynamic Model Registry with Structural Subtyping - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308105906.035_20260308_105941 Paper: pytrain.20260308105906.035	Generic Plugin Registry with Dynamic Module Loading This benchmark evaluates the performance and type safety of a generic plugin registry system using Python 3.12's PEP 695 Type Parameter Syntax. Features - PEP 695 Syntax: Uses `class PluginRegistry[T]` for cleaner generic definitions. -...	03-29 08:01	Success	-	View
exp_pytrain.20260308111448.036_20260308_111519 Paper: pytrain.20260308111448.036	Dynamic Module Loader with Strict Protocol Compliance README.md Dynamic Module Loader with Strict Protocol Compliance Overview This benchmark evaluates a robust package loading mechanism designed for dynamic plugin systems. The implementation demonstrates how an autonomous agent can construct...	03-29 08:01	Success	-	View
exp_pytrain.20260308112106.037_20260308_112129 Paper: pytrain.20260308112106.037	Generic Data Pipeline Benchmark README.md Generic Data Pipeline Benchmark This coding drill evaluates the implementation of a robust, type-safe data pipeline using Python's advanced standard library features. Objective The goal is to design a single-file module (`benchmar...	03-29 08:01	Success	-	View
exp_pytrain.20260308112731.038_20260308_112804 Paper: pytrain.20260308112731.038	Strictly Typed Plugin Registry Overview This benchmark challenges you to implement a robust, modular plugin architecture in Python using modern type hinting features. The goal is to create a system that enforces strict structural typing (Protocols) and type-safe storage...	03-29 08:01	Success	-	View
exp_pytrain.20260308113407.001_20260308_113441 Paper: pytrain.20260308113407.001	Robust Typed Plugin Loader: Benchmark & Verification README.md Robust Typed Plugin Loader: Benchmark & Verification Objective This benchmark evaluates a Python-based plugin architecture that relies on `typing.Protocol` for structural subtyping (duck typing with static type checking) combined...	03-29 08:01	Success	-	View
exp_pytrain.20260308114047.002_20260308_114111 Paper: pytrain.20260308114047.002	Generic Plugin Registry with PEP 695 - Benchmark Drill This benchmark validates the implementation of a generic plugin system using Python 3.12's Type Parameter Syntax (PEP 695). It tests syntax compliance, functional correctness of the generic registry, and runtime performance metrics. Accepta...	03-29 08:01	Success	-	View
exp_pytrain.20260308114710.003_20260308_114741 Paper: pytrain.20260308114710.003	In-Memory Plugin Loader with Strict Protocols README.md In-Memory Plugin Loader with Strict Protocols This benchmark implements a robust, file-system-free plugin architecture using Python's standard library. It demonstrates the creation of a custom import mechanism that loads Python mo...	03-29 08:01	Success	-	View
exp_pytrain.20260308120314.001_20260308_120342 Paper: pytrain.20260308120314.001	Structural Subtyping and Mock Package Registry Benchmark README.md Structural Subtyping and Mock Package Registry Benchmark Objective This benchmark evaluates a Python system's ability to leverage Structural Subtyping (using `typing.Protocol` and `@runtime_checkable`) to create a robust, zero...	03-29 08:01	Success	-	View
exp_pytrain.20260308120715.001_20260308_120745 Paper: pytrain.20260308120715.001	--- README.md Dynamic Plugin Loader with Strict Type Enforcement Overview This benchmark validates a zero-dependency, robust plugin architecture implementation using Python's standard library. It demonstrates dynamic module compilation, runtime...	03-29 08:01	Success	-	View
exp_pytrain.20260308121343.002_20260308_121418 Paper: pytrain.20260308121343.002	Python Skill Fallback Title: Modern Generic Plugin Loader with PEP 695 - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308122037.003_20260308_122053 Paper: pytrain.20260308122037.003	Benchmark: Strict Type-Verified Plugin Registry An autonomous coding system can simulate the robustness of a package distribution system by implementing a runtime registry that utilizes structural subtyping (Protocols) to validate interfaces. This ensures that only strictly compliant mod...	03-29 08:01	Success	-	View
exp_pytrain.20260308122648.004_20260308_122712 Paper: pytrain.20260308122648.004	Type-Safe Plugin Architecture with Namespace Management README.md Type-Safe Plugin Architecture with Namespace Management Design Brief This coding drill validates the hypothesis that utilizing `typing.Protocol` (Structural Subtyping) combined with explicit Namespace Management (`__all__`) provid...	03-29 08:01	Success	-	View
exp_pytrain.20260308123236.005_20260308_123255 Paper: pytrain.20260308123236.005	Typing-First Dynamic Module Loader Overview This benchmark evaluates an agent's ability to leverage Python's advanced type hinting features (specifically `typing.Protocol` and `@runtime_checkable`) to enforce structural subtyping (duck typing) at runtime. The task involves s...	03-29 08:01	Success	-	View
exp_pytrain.20260308124805.006_20260308_124833 Paper: pytrain.20260308124805.006	Type-Safe Plugin Registry Benchmark README.md Type-Safe Plugin Registry Benchmark This benchmark simulates the core functionality of complex ML frameworks (like Diffusers or vLLM) that rely on dynamic component discovery and strict interface adherence. Objective Implement a `...	03-29 08:01	Success	-	View
exp_pytrain.20260308125419.007_20260308_125437 Paper: pytrain.20260308125419.007	Strictly Typed Plugin Registry and Package Simulator README.md Strictly Typed Plugin Registry and Package Simulator Overview This benchmark tests the ability to design a robust, dependency-free component registry using Python's advanced `typing` features. It simulates a professional Python pa...	03-29 08:01	Success	-	View
exp_pytrain.20260308130014.008_20260308_130040 Paper: pytrain.20260308130014.008	Dynamic In-Memory Package Loader with Generic Registry README.md Dynamic In-Memory Package Loader with Generic Registry This benchmark evaluates the implementation of an advanced Python packaging mechanism where software distribution is simulated entirely in memory, alongside strict type safety...	03-29 08:01	Success	-	View
exp_pytrain.20260308132049.001_20260308_132120 Paper: pytrain.20260308132049.001	Python Skill Fallback Title: Generic Repository Pattern with Packaging Hygiene - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308132722.002_20260308_132754 Paper: pytrain.20260308132722.002	Generic Plugin Loader with PEP 695 This benchmark validates the use of Python 3.12's PEP 695 Type Parameter Syntax to define a generic plugin interface. It dynamically constructs a Python package in a temporary directory, creates a plugin module, and loads it using `importli...	03-29 08:01	Success	-	View
exp_pytrain.20260308133345.003_20260308_133425 Paper: pytrain.20260308133345.003	Type-Safe Plugin Registry Benchmark README.md Type-Safe Plugin Registry Benchmark Overview This coding drill validates the hypothesis that a robust, type-safe plugin architecture can be constructed using Python's standard library `typing.Protocol` for structural subtyping. It...	03-29 08:01	Success	-	View
exp_pytrain.20260308134029.004_20260308_134055 Paper: pytrain.20260308134029.004	Dynamic Plugin System with Structural Subtyping README.md Dynamic Plugin System with Structural Subtyping This benchmark tests the hypothesis that an autonomous coding system can effectively decouple interface definition from implementation by leveraging `typing.Protocol` for structural...	03-29 08:01	Success	-	View
exp_pytrain.20260308134727.005_20260308_134757 Paper: pytrain.20260308134727.005	This document outlines the design and execution of a coding benchmark focused on Strictly-Typed Dependency Graph Resol... README.md This document outlines the design and execution of a coding benchmark focused on Strictly-Typed Dependency Graph Resolution**. Overview The goal of this benchmark is to test the ability of a system to generate a robust, type-saf...	03-29 08:01	Success	-	View
exp_pytrain.20260308135336.006_20260308_135357 Paper: pytrain.20260308135336.006	Dynamic Component Registry with Runtime Type Validation Overview This benchmark evaluates a Python engineer's ability to construct a robust, dynamic plugin architecture using Python's standard library. The task involves generating a temporary package structure on the fly and implementing a regis...	03-29 08:01	Success	-	View
exp_pytrain.20260308140048.007_20260308_140107 Paper: pytrain.20260308140048.007	Generic Component Registry Benchmark This benchmark validates the implementation of a type-safe, generic registry pattern using Python's standard library. The pattern is fundamental in large-scale frameworks (like PyTorch or Lightning) for dynamically managing modules, optimiz...	03-29 08:01	Success	-	View
exp_pytrain.20260308140649.008_20260308_140720 Paper: pytrain.20260308140649.008	Robust Dependency Graph Resolver README.md Robust Dependency Graph Resolver This benchmark validates the implementation of a rigorous, type-safe dependency resolution engine suitable for inclusion in a package manager toolchain. Overview The `benchmark.py` script implement...	03-29 08:01	Success	-	View
exp_pytrain.20260308141314.009_20260308_141338 Paper: pytrain.20260308141314.009	Type-Safe Dynamic Plugin Loader README.md This benchmark evaluates a developer's ability to construct a robust, runtime-extensible plugin system using Python's `typing.Protocol` and `importlib`. Design Brief In an autonomous system, components often need to load third-par...	03-29 08:01	Success	-	View
exp_pytrain.20260308141954.010_20260308_142011 Paper: pytrain.20260308141954.010	Dynamic Plugin Loader with Typing Validation This benchmark simulates a robust plugin architecture by leveraging Python's `typing.Protocol` for structural subtyping. It demonstrates how to dynamically load and validate "packages" (mock objects) at runtime without explicit inheritance,...	03-29 08:01	Success	-	View
exp_pytrain.20260308142635.011_20260308_142718 Paper: pytrain.20260308142635.011	Stdlib ZipApp Builder with AST Type Enforcement Benchmark README.md Stdlib ZipApp Builder with AST Type Enforcement Benchmark This benchmark evaluates the ability to construct a robust build pipeline tool using only the Python standard library. Objective The candidate must implement a tool (`build...	03-29 08:01	Success	-	View
exp_pytrain.20260308143420.012_20260308_143439 Paper: pytrain.20260308143420.012	```markdown bash python benchmark.py	03-29 08:01	Success	-	View
exp_pytrain.20260308144100.013_20260308_144125 Paper: pytrain.20260308144100.013	Python Skill Fallback Title: Protocol-Based Plugin System with Dependency Resolution - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308144712.014_20260308_144733 Paper: pytrain.20260308144712.014	Python Skill Fallback Title: Strict Config Validator & PEP 440 Environment Checker - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308145400.015_20260308_145423 Paper: pytrain.20260308145400.015	Typed Plugin Registry System README.md Typed Plugin Registry System Overview This benchmark demonstrates the implementation of a robust, type-safe plugin system using modern Python type hinting features (PEP 484) and the `typing.Protocol` definition. Design Principles...	03-29 08:01	Success	-	View
exp_pytrain.20260308150033.016_20260308_150059 Paper: pytrain.20260308150033.016	Strictly-Typed Dynamic Plugin Loader README.md Strictly-Typed Dynamic Plugin Loader Overview This benchmark demonstrates an autonomous system capable of utilizing Python's advanced type hinting system to enforce runtime interface compliance while dynamically discovering and lo...	03-29 08:01	Success	-	View
exp_pytrain.20260308150738.017_20260308_150800 Paper: pytrain.20260308150738.017	Protocol-Validated Dynamic Plugin Loader README.md Protocol-Validated Dynamic Plugin Loader This benchmark tests an autonomous coding system's ability to leverage Python's standard library to perform advanced metaprogramming tasks. Hypothesis An autonomous system can programmatica...	03-29 08:01	Success	-	View
exp_pytrain.20260308151327.018_20260308_151343 Paper: pytrain.20260308151327.018	Dynamic Package Loader with Runtime Type Validation README.md Dynamic Package Loader with Runtime Type Validation Objective This benchmark evaluates the ability of a Python system to programmatically generate code, manage the file system, load modules dynamically, and enforce structural subt...	03-29 08:01	Success	-	View
exp_pytrain.20260308152025.019_20260308_152050 Paper: pytrain.20260308152025.019	Strictly Typed Module Registry with Semantic Versioning README.md Strictly Typed Module Registry with Semantic Versioning This benchmark evaluates a candidate's ability to design a robust, zero-dependency plugin architecture within the Python standard library. It focuses on modern typing protoco...	03-29 08:01	Success	-	View
exp_pytrain.20260308153659.020_20260308_153722 Paper: pytrain.20260308153659.020	Benchmark: Strictly-Typed Backend Registry with Dynamic Loading README.md Benchmark: Strictly-Typed Backend Registry with Dynamic Loading Overview This benchmark evaluates a Python system's capability to manage heterogeneous numerical backends using advanced type hinting features (`typing.Protocol`, `ty...	03-29 08:01	Success	-	View
exp_pytrain.20260308154435.021_20260308_154504 Paper: pytrain.20260308154435.021	Strict Package Metadata & Build System Simulator README.md Strict Package Metadata & Build System Simulator Overview This benchmark tests the ability to construct a robust, self-documenting Python packaging utility using advanced standard library typing features. The goal is to enforce da...	03-29 08:01	Success	-	View
exp_pytrain.20260308155125.022_20260308_155147 Paper: pytrain.20260308155125.022	Generic Virtual Package Builder Benchmark README.md This coding drill evaluates your ability to leverage modern Python 3.12+ typing features (PEP 695) and dynamic module introspection to create a robust build utility. Objective: Implement a `PackageBuilder[T]` generic class cap...	03-29 08:01	Success	-	View
exp_pytrain.20260308155845.023_20260308_155913 Paper: pytrain.20260308155845.023	Typed Distribution Simulator Benchmark README.md Typed Distribution Simulator Benchmark This project demonstrates a robust, single-file Python implementation of a local package registry manager (`pkg_simulator`), designed with high-level static typing and packaging standards. Fe...	03-29 08:01	Success	-	View
exp_pytrain.20260308160711.024_20260308_160743 Paper: pytrain.20260308160711.024	Strictly-Typed Event Dispatcher Benchmark README.md This benchmark tests the creation of a strictly-typed Event Dispatcher system using Python's standard library `typing` module. It enforces compile-time type safety using `Protocol` and `Generic`. Prerequisites - Python 3.10+ - `my...	03-29 08:01	Success	-	View
exp_pytrain.20260308162949.001_20260308_163012 Paper: pytrain.20260308162949.001	Strictly Typed Plugin Registry Benchmark README.md Strictly Typed Plugin Registry Benchmark Overview This benchmark demonstrates a robust, self-validating extension system (plugin registry) built with Python's standard library. It leverages `typing.Protocol`, `runtime_checkable`,...	03-29 08:01	Success	-	View
exp_pytrain.20260308163401.001_20260308_163419 Paper: pytrain.20260308163401.001	Typed ZipApp Distribution Benchmark README.md Typed ZipApp Distribution Benchmark Design Brief: This benchmark evaluates an autonomous coding system's ability to programmatically generate, structure, and package a Python application using modern static typing features and...	03-29 08:01	Success	-	View
exp_pytrain.20260308164102.001_20260308_164125 Paper: pytrain.20260308164102.001	Strictly Typed Modular Data Aggregator Overview This benchmark demonstrates the implementation of a strictly typed, modular data processing system using Python's standard library `typing` features. It simulates a professional package structure within a single file, leveraging `P...	03-29 08:01	Success	-	View
exp_pytrain.20260308164451.001_20260308_164513 Paper: pytrain.20260308164451.001	Coding Drill: Typed Plugin System Benchmark README.md Coding Drill: Typed Plugin System Benchmark Objective Design and verify a Python package `processor_pkg` that demonstrates strict adherence to typing standards (using `Protocol` and `TypeVar`) and encapsulation (controlling API ex...	03-29 08:01	Success	-	View
exp_pytrain.20260308164845.001_20260308_164907 Paper: pytrain.20260308164845.001	Benchmark: Strict Package Metadata Validator with Extensible Type Guards README.md Benchmark: Strict Package Metadata Validator with Extensible Type Guards Overview This benchmark implements a robust, runtime type-safe validator for Python package metadata, simulating structures found in `pyproject.toml`. It dem...	03-29 08:01	Success	-	View
exp_pytrain.20260308165654.001_20260308_165717 Paper: pytrain.20260308165654.001	Python Skill Fallback Title: Dynamic Package Construction and Strict Protocol Verification - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_pytrain.20260308170418.002_20260308_170452 Paper: pytrain.20260308170418.002	--- Generic Data Pipeline Refactoring using PEP 695 Design Brief Hypothesis: Adopting Python 3.12's PEP 695 Type Parameter Syntax enhances the clarity and maintainability of generic algorithms by reducing boilerplate and scoping type variab...	03-29 08:01	Success	-	View
exp_pytrain.20260308171044.003_20260308_171105 Paper: pytrain.20260308171044.003	Dynamic Plugin Loader with Strict Type Verification This benchmark demonstrates a robust plugin architecture where Python code is loaded at runtime from a string, injected into `sys.path`, and rigorously validated against a `typing.Protocol`. This ensures that third-party or user-defined cod...	03-29 08:01	Success	-	View
exp_pytrain.20260308171406.001_20260308_171433 Paper: pytrain.20260308171406.001	Strictly Typed Dependency Resolution Simulation README.md Strictly Typed Dependency Resolution Simulation Overview This benchmark tests the ability to design a robust, lightweight package manager simulation using advanced Python typing constructs. The core hypothesis is that strict typin...	03-29 08:01	Success	-	View
exp_pytrain.20260308172046.002_20260308_172110 Paper: pytrain.20260308172046.002	Python Reliability Drill: Typing & Generics README.md Python Reliability Drill: Typing & Generics This drill benchmarks your ability to implement robust, type-safe utilities using modern Python type systems (PEP 695) without external dependencies. Objective Implement a generic contai...	03-29 08:01	Success	-	View
exp_pytrain.20260308172740.003_20260308_172803 Paper: pytrain.20260308172740.003	Type-Verified Zip Application Packager README.md Type-Verified Zip Application Packager This benchmark is designed to test the implementation of a robust, type-safe Python application packager. Overview The script defines a packaging pipeline that enforces strict typing on appli...	03-29 08:01	Success	-	View
exp_pytrain.20260308173339.004_20260308_173418 Paper: pytrain.20260308173339.004	```markdown README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_pytrain.20260308174702.005_20260308_174721 Paper: pytrain.20260308174702.005	Dynamic Plugin Loader with Protocol Validation This benchmark demonstrates a robust, type-safe plugin architecture using Python's standard library. Overview The `PluginManager` class in this benchmark: 1. Uses `tempfile` to dynamically construct a valid Python package directory structur...	03-29 08:01	Success	-	View
exp_pytrain.20260308175256.006_20260308_175321 Paper: pytrain.20260308175256.006	Python Coding Drill: Lazy-Loaded Module Simulation README.md Python Coding Drill: Lazy-Loaded Module Simulation Objective This benchmark challenges the developer to architect a simulation of a high-performance library's internal structure (similar to `vllm` or `diffusers`). The task involve...	03-29 08:01	Success	-	View
exp_pytrain.20260308180011.007_20260308_180041 Paper: pytrain.20260308180011.007	Python Skill Fallback Title: Strictly Typed Plugin Registry with Dynamic Discovery - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-29 08:01	Success	-	View
exp_self.20260307063408.001_20260307_063436 Paper: self.20260307063408.001	Adaptive Precision Hierarchical Distillation Benchmark README.md Adaptive Precision Hierarchical Distillation Benchmark This repository evaluates the Adaptive Precision Hierarchical Distillation methodology. It tests the hypothesis that a student model utilizing hierarchical attention and d...	03-29 08:01	Success	-	View
exp_self.20260307063657.001_20260307_063731 Paper: self.20260307063657.001	Here is the runnable benchmark code for the Dynamic Precision Hierarchical Distillation innovation. README.md Dynamic Precision Hierarchical Distillation Benchmark This repository contains a minimal, runnable benchmark for the "Dynamic Precision Hierarchical Distillation with Selective Memory Caching" innovation. Innovation Summary This b...	03-29 08:01	Pending	-	View
exp_self.20260307064659.001_20260307_064737 Paper: self.20260307064659.001	Here is the design for the benchmark. This setup uses PyTorch to simulate the workload of a Transformer-based model, com... README.md bash pip install torch bash python benchmark.py	03-29 08:01	Pending	-	View
exp_self.20260307170335.003_20260307_170400 Paper: self.20260307170335.003	Benchmark: Dynamic-Precision State Caching for Mamba SSMs README.md Benchmark: Dynamic-Precision State Caching for Mamba SSMs Overview This benchmark validates the hypothesis that utilizing dynamic precision (bfloat16) for the recurrent hidden states of Mamba (SSM) models can reduce VRAM press...	03-29 08:01	Success	-	View
exp_self.20260307170553.004_20260307_170622 Paper: self.20260307170553.004	Memory-Efficient Distillation of Mamba SSMs with Dynamic Precision Caching README.md This benchmark evaluates a novel training approach for State Space Models (SSMs), specifically focusing on a Mamba-based student model distilled from a Transformer teacher. The core innovation lies in the integration of **layer-wi...	03-29 08:01	Success	-	View
exp_self.20260307171222.005_20260307_171449 Paper: self.20260307171222.005	Here is the design and implementation for the Dynamic Precision Caching for Low-Memory SSM Distillation benchmark. README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260307171956.006_20260307_172024 Paper: self.20260307171956.006	Section 1: README.md Dynamic Precision State Caching for Memory-Efficient Mamba Distillation Overview This benchmark validates the hypothesis that distilling a Transformer teacher into a Mamba-like SSM student can be run on memory-constrained GPUs (8GB) by util...	03-29 08:01	Success	-	View
exp_self.20260307172232.007_20260307_172325 Paper: self.20260307172232.007	--- README.md --- Self-Directed Benchmark: SSM Strategy Stress Test Innovation Summary This benchmark validates the hypothesis that State Space Model (SSM) inference strategies, which utilize fixed-size recurrent state buffers rather than g...	03-29 08:01	Success	-	View
exp_self.20260307172511.008_20260307_172539 Paper: self.20260307172511.008	Here are the two files as requested. README.md Dynamic Precision State Caching for Memory-Efficient SSM Distillation Overview This benchmark evaluates an "Innovation" technique designed to optimize the training of State Space Models (SSMs) on hardware-constrained devices (e.g....	03-29 08:01	Success	-	View
exp_self.20260307172903.009_20260307_172930 Paper: self.20260307172903.009	Dynamic Precision State Caching for Memory-Efficient SSM Distillation README.md Dynamic Precision State Caching for Memory-Efficient SSM Distillation This benchmark evaluates a novel approach to training State Space Models (SSMs), specifically focusing on the Mamba architecture, under strict memory constraint...	03-29 08:01	Success	-	View
exp_self.20260307173243.010_20260307_173313 Paper: self.20260307173243.010	Benchmark: Memory-Efficient SSM Distillation via Dynamic State Precision README.md Benchmark: Memory-Efficient SSM Distillation via Dynamic State Precision Overview This benchmark evaluates a hypothesis regarding State Space Models (SSMs): that explicitly enforcing lower precision (FP16) on the recurrent hidden...	03-29 08:01	Success	-	View
exp_self.20260307174235.012_20260307_174802 Paper: self.20260307174235.012	Benchmark: Adaptive Layer-wise State Precision for SSMs README.md Benchmark: Adaptive Layer-wise State Precision for SSMs Overview This benchmark evaluates the efficiency gains of applying Adaptive Layer-wise State Precision to State Space Models (SSMs). In the context of SSM distillation, s...	03-29 08:01	Success	-	View
exp_self.20260307180747.014_20260307_180816 Paper: self.20260307180747.014	Memory-Efficient SSM Distillation Benchmark README.md Memory-Efficient SSM Distillation Benchmark This benchmark validates the hypothesis that a State Space Model (Student) can effectively distill knowledge from a larger Transformer (Teacher) while strictly adhering to an 8GB VRAM bu...	03-29 08:01	Success	-	View
exp_self.20260307180936.015_20260307_181011 Paper: self.20260307180936.015	Efficient SSM Distillation Benchmark README.md Efficient SSM Distillation Benchmark This benchmark implements a teacher-student distillation setup where a GPT-2 model (Teacher) transfers knowledge to a lightweight Mamba-style State Space Model (Student). Key Features 1. **Cust...	03-29 08:01	Success	-	View
exp_self.20260307181354.016_20260307_181648 Paper: self.20260307181354.016	Benchmark: Efficient SSM Distillation with Dynamic Precision and State Caching README.md Benchmark: Efficient SSM Distillation with Dynamic Precision and State Caching This benchmark evaluates the performance gains of a hypothetical Student State Space Model (SSM) against a baseline Teacher model. The innovation focus...	03-29 08:01	Success	-	View
exp_self.20260307181801.017_20260307_181840 Paper: self.20260307181801.017	Here is the runnable benchmark for the SSM Distillation with Dynamic Precision and Memory-Cache Optimization. README.md SSM Distillation with Dynamic Precision and Memory-Cache Optimization This repository contains a benchmark designed to test the hypothesis that integrating Dynamic Precision training into the Knowledge Distillation of a **...	03-29 08:01	Success	-	View
exp_self.20260307182132.018_20260307_182212 Paper: self.20260307182132.018	Efficient SSM Distillation Benchmark README.md Efficient SSM Distillation Benchmark ===================================== This benchmark evaluates the performance of a Knowledge Distillation pipeline where a Transformer-based teacher model trains a simplified Mamba-like Select...	03-29 08:01	Success	-	View
exp_self.20260307182441.019_20260307_182547 Paper: self.20260307182441.019	Memory-Efficient SSM Distillation via Dynamic State Caching README.md --- Memory-Efficient SSM Distillation Benchmark Overview This benchmark evaluates the hypothesis that applying dynamic precision (specifically FP16) to the recurrent state cache of a Student State Space Model (SSM) during know...	03-29 08:01	Success	-	View
exp_self.20260307184318.021_20260307_184424 Paper: self.20260307184318.021	Benchmark: Dynamic Precision Recurrent State Caching for SSMs README.md Benchmark: Dynamic Precision Recurrent State Caching for SSMs Overview This benchmark evaluates the memory efficiency of a Dynamic Precision Recurrent State Caching mechanism designed for State Space Models (SSMs) during the d...	03-29 08:01	Success	-	View
exp_self.20260307184654.022_20260307_185114 Paper: self.20260307184654.022	Zero-Shot SSM Distillation Benchmark README.md Zero-Shot SSM Distillation Benchmark This benchmark evaluates the performance characteristics of the Zero-Shot SSM Distillation technique. The innovation focuses on two primary efficiency mechanisms: 1. Adaptive Precision:...	03-29 08:01	Success	-	View
exp_self.20260307185210.023_20260307_185249 Paper: self.20260307185210.023	Here are the two sections of the runnable benchmark, designed to demonstrate Adaptive-Precision SSM Distillation with Re... bash pip install torch transformers tqdm python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260307190419.024_20260307_190458 Paper: self.20260307190419.024	Benchmark: Low-Memory SSM Distillation via Cached State Quantization README.md Benchmark: Low-Memory SSM Distillation via Cached State Quantization This benchmark evaluates the hypothesis that applying dynamic precision quantization to the recurrent state cache of a State Space Model (SSM) student can signif...	03-29 08:01	Success	-	View
exp_self.20260307190718.025_20260307_190752 Paper: self.20260307190718.025	Benchmark: Dynamic Precision SSM Distillation README.md Benchmark: Dynamic Precision SSM Distillation This benchmark evaluates the hypothesis that applying dynamic precision reduction to the recurrent state cache of a distilled State Space Model (SSM) can significantly reduce peak VRAM...	03-29 08:01	Success	-	View
exp_self.20260307191059.026_20260307_191143 Paper: self.20260307191059.026	```markdown bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260307191408.027_20260307_191450 Paper: self.20260307191408.027	You are an ML engineer creating a safe, runnable benchmarking code. Design a small, runnable benchmark for this innovation. STRICT REQUIREMENT: Output two sections separated by '	03-29 08:01	Success	-	View
exp_self.20260307191701.028_20260307_191730 Paper: self.20260307191701.028	Memory-Efficient SSM Distillation Benchmark README.md Memory-Efficient SSM Distillation Benchmark This repository contains a minimal, runnable benchmark designed to evaluate the hypothesis that Dynamic Precision State Caching enables the training of State Space Models (SSMs) via...	03-29 08:01	Success	-	View
exp_self.20260307191944.029_20260307_192257 Paper: self.20260307191944.029	Innovation: Fine-Grained Dynamic Precision in SSM State Caching README.md Innovation: Fine-Grained Dynamic Precision in SSM State Caching This benchmark validates the efficiency gains of applying dynamic precision reduction (FP32 -> FP16/BF16) specifically to the recurrent state cache of a State Space M...	03-29 08:01	Success	-	View
exp_self.20260307192342.030_20260307_192415 Paper: self.20260307192342.030	Cache-Aware Dynamic Precision Distillation for Memory-Constrained SSMs README.md Cache-Aware Dynamic Precision Distillation for Memory-Constrained SSMs Overview This benchmark evaluates an innovation aimed at running large State Space Models (SSMs) on memory-constrained hardware (8GB VRAM). The core hypothesis...	03-29 08:01	Success	-	View
exp_self.20260307192642.031_20260307_192725 Paper: self.20260307192642.031	Dynamic Precision State-Cache Distillation for Low-Resource SSMs README.md Dynamic Precision State-Cache Distillation for Low-Resource SSMs Overview This benchmark evaluates a hypothesis for optimizing State Space Models (SSMs) on memory-constrained hardware (e.g., 8GB GPUs). The innovation introduces a...	03-29 08:01	Success	-	View
exp_self.20260307192857.032_20260307_192928 Paper: self.20260307192857.032	Here is the runnable benchmark designed for the "Dynamic-Precision State Distillation" innovation. README.md Dynamic-Precision State Distillation for Low-Resource SSMs Overview This benchmark tests the hypothesis that applying dynamic precision (FP16) to the state cache of a student SSM (distilled from a larger teacher) significantly red...	03-29 08:01	Success	-	View
exp_self.20260307193200.033_20260307_193235 Paper: self.20260307193200.033	This repository contains a benchmark for "State-Quantized Distillation for Low-Latency SSMs." README.md This repository contains a benchmark for "State-Quantized Distillation for Low-Latency SSMs." Overview This benchmark tests the hypothesis that dynamically quantizing the recurrent state cache of a State Space Model (SSM) from FP3...	03-29 08:01	Success	-	View
exp_self.20260307193600.034_20260307_193630 Paper: self.20260307193600.034	```markdown bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260307193840.035_20260307_193914 Paper: self.20260307193840.035	Dynamic-Precision State-Cache Distillation Benchmark README.md Dynamic-Precision State-Cache Distillation Benchmark This repository contains a minimal, self-contained benchmark to validate the memory efficiency of Dynamic-Precision State-Cache Distillation for State-Space Models (SSMs). H...	03-29 08:01	Success	-	View
exp_self.20260307194432.037_20260307_194641 Paper: self.20260307194432.037	Benchmark: Dynamic-Precision State-Cache Distillation for SSMs README.md Benchmark: Dynamic-Precision State-Cache Distillation for SSMs This benchmark evaluates the memory efficiency and inference throughput of a novel State Space Model (SSM) approach. The proposed innovation ("Dynamic-Precision State-...	03-29 08:01	Success	-	View
exp_self.20260307194829.038_20260307_194915 Paper: self.20260307194829.038	Dynamic-Precision State-Cache Distillation Benchmark README.md Dynamic-Precision State-Cache Distillation Benchmark Overview This benchmark tests the hypothesis that a State Space Model (SSM) using Dynamic Precision for recurrent state tensors and Gradient Checkpointing for state cach...	03-29 08:01	Success	-	View
exp_self.20260307195135.039_20260307_195212 Paper: self.20260307195135.039	Adaptive State Distillation for Low-Memory SSMs README.md Adaptive State Distillation for Low-Memory SSMs Overview This benchmark validates the hypothesis that a Student SSM utilizing dynamic precision (FP16 state caching) can maintain throughput comparable to a standard **Teache...	03-29 08:01	Success	-	View
exp_self.20260307195425.040_20260307_195501 Paper: self.20260307195425.040	Dynamic State-Cache Distillation for Low-Memory SSMs README.md Dynamic State-Cache Distillation for Low-Memory SSMs Innovation Overview This benchmark demonstrates a novel approach to optimizing State Space Models (SSMs), specifically the Mamba architecture, for edge-constrained environments....	03-29 08:01	Success	-	View
exp_self.20260307195724.041_20260307_195807 Paper: self.20260307195724.041	Dynamic-Precision State-Cache Distillation Benchmark README.md Dynamic-Precision State-Cache Distillation Benchmark This repository contains a minimal, self-contained benchmark designed to validate the "Dynamic-Precision State-Cache Distillation" hypothesis for State Space Models (SSMs). Hypo...	03-29 08:01	Success	-	View
exp_self.20260307200025.042_20260307_200232 Paper: self.20260307200025.042	Benchmark: Dynamic-Precision State-Cache for SSMs README.md Benchmark: Dynamic-Precision State-Cache for SSMs This benchmark evaluates the "Dynamic-Precision State-Cache Distillation" concept for State Space Models (SSMs). Since the original architecture generation was skipped, this benchm...	03-29 08:01	Success	-	View
exp_self.20260307200347.043_20260307_200704 Paper: self.20260307200347.043	Based on the provided abstract and innovation title, here is a runnable benchmark design. Since the abstract mentions th... The benchmark compares a standard full-precision SSM (Baseline) against an SSM utilizing dynamic precision for its state cache (Innovation). --- README.md Benchmark: Dynamic State-Cache Distillation for Low-Memory SSMs Overview This benchma...	03-29 08:01	Success	-	View
exp_self.20260307200745.044_20260307_200825 Paper: self.20260307200745.044	Dynamic Precision State Caching for SSMs via Logit Distillation README.md Dynamic Precision State Caching for SSMs via Logit Distillation Overview This benchmark implements a minimal Selective State Space Model (Mamba-style) to test the hypothesis that storing recurrent state tensors in dynamic precisio...	03-29 08:01	Success	-	View
exp_self.20260307201045.045_20260307_201409 Paper: self.20260307201045.045	```markdown bash pip install torch bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260307201445.046_20260307_201519 Paper: self.20260307201445.046	Adaptive Precision State Caching for Mamba SSMs README.md Adaptive Precision State Caching for Mamba SSMs Overview This benchmark evaluates an Adaptive Precision State Caching mechanism designed for Mamba-style State Space Models (SSMs). The core hypothesis is that by storing recurre...	03-29 08:01	Success	-	View
exp_self.20260307202107.001_20260307_202143 Paper: self.20260307202107.001	Adaptive Precision State Cache for Mamba SSMs README.md Adaptive Precision State Cache for Mamba SSMs Overview This benchmark validates the "Adaptive Precision State Cache" hypothesis. It demonstrates that dynamically quantizing the recurrent state cache of a Mamba Selective State Spac...	03-29 08:01	Success	-	View
exp_self.20260307202349.002_20260307_202420 Paper: self.20260307202349.002	Memory-Constrained Dynamic Precision Caching for Mamba SSMs This benchmark evaluates a hypothesis regarding dynamic precision in State Space Models (specifically a Mamba-style architecture). Hypothesis: By storing the recurrent hidden state in half-precision (FP16) while maintaining the immediat...	03-29 08:01	Success	-	View
exp_self.20260307202731.003_20260307_202810 Paper: self.20260307202731.003	Dynamic Precision State Cache for Efficient Mamba Inference README.md Dynamic Precision State Cache for Efficient Mamba Inference Overview This benchmark implements and tests a novel memory optimization for State Space Models (specifically Mamba architectures). The core innovation involves applying...	03-29 08:01	Success	-	View
exp_self.20260307203029.004_20260307_203109 Paper: self.20260307203029.004	```markdown README.md bash pip install torch numpy python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260307203435.005_20260307_203518 Paper: self.20260307203435.005	--- README.md --- Benchmark: Dynamic-Precision State Caching for SSMs This repository contains a minimal, runnable benchmark designed to validate the hypothesis regarding memory-efficient State Space Models (SSMs). Hypothesis Employing dynamic...	03-29 08:01	Success	-	View
exp_self.20260307203739.006_20260307_203812 Paper: self.20260307203739.006	```markdown README.md bash pip install torch tqdm bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260307204142.007_20260307_204219 Paper: self.20260307204142.007	Dynamic-Precision State Distillation Benchmark README.md Dynamic-Precision State Distillation Benchmark This benchmark evaluates Dynamic-Precision State Distillation, a technique to optimize State Space Models (SSMs) like Mamba. The Innovation Standard SSMs maintain high-precision (...	03-29 08:01	Success	-	View
exp_self.20260307205408.008_20260307_205452 Paper: self.20260307205408.008	Here is the runnable benchmark code. README.md Dynamic-Precision State Distillation Benchmark This benchmark validates the hypothesis that dynamic precision switching combined with knowledge distillation reduces VRAM usage for SSM training without sacrificing perplexity. Metho...	03-29 08:01	Success	-	View
exp_self.20260307205730.009_20260307_205821 Paper: self.20260307205730.009	Here is the design for the "Mixed-Precision State Distillation for Low-Resource SSMs" benchmark. No summary available yet.	03-29 08:01	Success	-	View
exp_self.20260307210026.010_20260307_210111 Paper: self.20260307210026.010	Adaptive State Space Distillation with Dynamic Precision Caching README.md Adaptive State Space Distillation with Dynamic Precision Caching Overview This repository contains a benchmark implementation for Adaptive State Space Distillation with Dynamic Precision Caching. The core innovation combines *...	03-29 08:01	Success	-	View
exp_self.20260307210324.011_20260307_210618 Paper: self.20260307210324.011	Benchmark: Memory-Adaptive SSM Distillation via Dynamic Precision Caching README.md Benchmark: Memory-Adaptive SSM Distillation via Dynamic Precision Caching This benchmark evaluates a simulated implementation of a State Space Model (SSM) that utilizes dynamic precision switching to optimize memory bandwidth...	03-29 08:01	Success	-	View
exp_self.20260307210718.012_20260307_210753 Paper: self.20260307210718.012	Dynamic Precision SSM Distillation Benchmark README.md Dynamic Precision SSM Distillation Benchmark This repository contains the benchmark code for evaluating Dynamic Precision SSM Distillation with State Memory Caching. Overview This benchmark tests the hypothesis that using Auto...	03-29 08:01	Success	-	View
exp_self.20260307212455.013_20260307_212537 Paper: self.20260307212455.013	Here is the design for the benchmark. README.md bash pip install torch numpy bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260307212753.014_20260307_213143 Paper: self.20260307212753.014	Benchmark: Dynamic Precision SSM with State Caching & Memory Distillation README.md Benchmark: Dynamic Precision SSM with State Caching & Memory Distillation Overview This benchmark evaluates a synthetic State Space Model (SSM) implementation designed to test the efficiency gains of three key architectural innova...	03-29 08:01	Success	-	View
exp_self.20260307213254.015_20260307_213338 Paper: self.20260307213254.015	This benchmark evaluates the "Dynamic Precision SSM Distillation with State Memory Caching" innovation. README.md This benchmark evaluates the "Dynamic Precision SSM Distillation with State Memory Caching" innovation. Hypothesis By distilling a lightweight State Space Model (SSM) from a larger Transformer teacher and utilizing Dynamic Precisi...	03-29 08:01	Success	-	View
exp_self.20260307213618.016_20260307_213647 Paper: self.20260307213618.016	This benchmark evaluates a synthetic implementation of a Dynamic Precision State Space Model (SSM). The goal is to valid... README.md This benchmark evaluates a synthetic implementation of a Dynamic Precision State Space Model (SSM). The goal is to validate the hypothesis that utilizing reduced precision (FP16) for the recurrent state tensors during inference—si...	03-29 08:01	Success	-	View
exp_self.20260307213856.017_20260307_213942 Paper: self.20260307213856.017	This repository contains the benchmarking suite for the "Dynamic Precision SSM with Cached State Distillation" project. README.md This repository contains the benchmarking suite for the "Dynamic Precision SSM with Cached State Distillation" project. Objective To validate the hypothesis that a State Space Model (SSM) utilizing Dynamic Precision (AMP), State C...	03-29 08:01	Success	-	View
exp_self.20260307214200.018_20260307_214535 Paper: self.20260307214200.018	Here is the runnable benchmark for the innovation described in the title "Memory-Efficient Mamba Distillation via Activa... bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260307214620.019_20260307_214713 Paper: self.20260307214620.019	8GB-Optimized SSM Distillation Benchmark README.md 8GB-Optimized SSM Distillation Benchmark This benchmark validates the 8GB-Optimized SSM Distillation innovation. Hypothesis By offloading Teacher Logit computation to a CPU cache and utilizing Dynamic Precision (AMP), we can t...	03-29 08:01	Success	-	View
exp_self.20260307214829.020_20260307_214913 Paper: self.20260307214829.020	Dynamic Precision SSM Distillation with CPU-State Offloading Benchmark README.md Dynamic Precision SSM Distillation with CPU-State Offloading Benchmark This benchmark validates the hypothesis that a State Space Model (SSM) student can be effectively distilled from a Transformer teacher on memory-constrained ha...	03-29 08:01	Success	-	View
exp_self.20260307215208.021_20260307_215236 Paper: self.20260307215208.021	CPU-Offloaded Dynamic Precision SSM Distillation README.md CPU-Offloaded Dynamic Precision SSM Distillation This benchmark demonstrates a novel training optimization for State Space Models (SSMs), specifically targeting scenarios where GPU VRAM is constrained (e.g., 8GB cards). Innovation...	03-29 08:01	Success	-	View
exp_self.20260307215527.022_20260307_215834 Paper: self.20260307215527.022	Here is the runnable benchmark for the "Hybrid-Precision State-Checkpointing for SSM Distillation" innovation. README.md	03-29 08:01	Success	-	View
exp_self.20260307215935.023_20260307_220013 Paper: self.20260307215935.023	Dynamic-Precision SSM Distillation Benchmark README.md Dynamic-Precision SSM Distillation Benchmark This benchmark validates the hypothesis that a combination of System RAM Caching, Gradient Checkpointing, and Dynamic Precision (AMP) can enable the distillation of a large...	03-29 08:01	Success	-	View
exp_self.20260307221231.024_20260307_221254 Paper: self.20260307221231.024	Dynamic-Precision SSM Distillation Benchmark README.md Dynamic-Precision SSM Distillation Benchmark Overview This benchmark demonstrates a novel training optimization technique designed to fit Large Language Model (LLM) distillation into strict hardware constraints (specifically 8GB V...	03-29 08:01	Success	-	View
exp_self.20260307221456.025_20260307_221705 Paper: self.20260307221456.025	Here is the design for the Backfill Candidate benchmark. Since the abstract indicates the original output was empty ("architect_output_empty"), this implementation realizes the intent described in the title: Dynamic-Precision SSM Distillation with Gradient-Gated State Caching. We define a l...	03-29 08:01	Success	-	View
exp_self.20260307221924.026_20260307_222120 Paper: self.20260307221924.026	Here is a runnable benchmark for the "Dynamic-Precision SSM with Recurrent State Caching" innovation, designed to profil... README.md --- Dynamic-Precision SSM Benchmark This benchmark evaluates the performance characteristics of a Dynamic-Precision State Space Model (SSM) utilizing Recurrent State Caching. Innovation Summary Traditional SSMs (like S4 or...	03-29 08:01	Success	-	View
exp_self.20260307222152.027_20260307_222225 Paper: self.20260307222152.027	Section 1: README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260307222454.028_20260307_222708 Paper: self.20260307222454.028	Mixed-Precision SSM State Caching Benchmark README.md Mixed-Precision SSM State Caching Benchmark This benchmark implements a lightweight, runnable simulation of a State Space Model (SSM) with State Caching and Mixed-Precision optimization. It is designed to verify the ef...	03-29 08:01	Success	-	View
exp_self.20260307222754.029_20260307_222823 Paper: self.20260307222754.029	Benchmark: Dynamic-Precision SSM Distillation with State Caching README.md Benchmark: Dynamic-Precision SSM Distillation with State Caching This benchmark evaluates a hardware-efficient training and inference pipeline for State Space Models (SSMs). Hypothesis By distilling a large Transformer into a smal...	03-29 08:01	Success	-	View
exp_self.20260307223147.030_20260307_223229 Paper: self.20260307223147.030	--- README.md Layer-Wise Dynamic-Precision SSM Distillation Benchmark This repository contains a minimal, runnable benchmark for Layer-Wise Dynamic-Precision SSM Distillation with Activation Caching. Innovation Summary This benchmark demons...	03-29 08:01	Success	-	View
exp_self.20260307223442.031_20260307_223523 Paper: self.20260307223442.031	This repository contains the benchmark implementation for Dynamic-Precision SSM Distillation with Gradient-Sensitive S... README.md This repository contains the benchmark implementation for Dynamic-Precision SSM Distillation with Gradient-Sensitive State Caching**. Overview This benchmark validates the hypothesis that dynamically switching between FP16 and F...	03-29 08:01	Success	-	View
exp_self.20260307223735.032_20260307_223811 Paper: self.20260307223735.032	Memory-Efficient SSM Distillation Benchmark README.md Memory-Efficient SSM Distillation Benchmark This benchmark validates the hypothesis that Segment State Caching combined with Dynamic Precision can significantly reduce the memory footprint of training a State Space Model (...	03-29 08:01	Success	-	View
exp_self.20260307224116.033_20260307_224233 Paper: self.20260307224116.033	Memory-Efficient SSM Distillation Benchmark By monitoring the gradient magnitude of the SSM hidden state during the backward pass, we can dynamically downshift the state cache precision (BF16 vs FP32), reducing VRAM usage by >15% while maintaining model accuracy.	03-29 08:01	Success	-	View
exp_self.20260307224448.034_20260307_224526 Paper: self.20260307224448.034	Cache-Aware Dynamic Precision Distillation for SSMs README.md Cache-Aware Dynamic Precision Distillation for SSMs This repository contains the benchmark implementation for Cache-Aware Dynamic Precision Distillation. This innovation targets State Space Models (SSMs) to reduce memory footp...	03-29 08:01	Success	-	View
exp_self.20260307224731.035_20260307_224803 Paper: self.20260307224731.035	Memory-Bounded SSM Distillation Benchmark This benchmark evaluates a novel Segmented State Caching mechanism with Dynamic Precision for training State Space Models (SSMs) under strict memory constraints. Innovation Summary Standard SSM training (e.g., Mamba architectures) r...	03-29 08:01	Success	-	View
exp_self.20260307225949.036_20260307_230023 Paper: self.20260307225949.036	Explanation of the Design The benchmark is designed to validate the "Dynamic-Precision SSM Distillation" hypothesis. 1. Synthetic SSM Model: Instead of relying on external `mamba-ssm` libraries which may be hard to install/benchmark in a standalone script, I imp...	03-29 08:01	Success	-	View
exp_self.20260307230248.037_20260307_230311 Paper: self.20260307230248.037	Here is the design for the benchmark, split into the README and the runnable Python script as requested. This benchmark implements a synthetic SSM (State Space Model) distillation pipeline. It compares a full-precision Teacher model against a Student model that utilizes Selective State Caching (forcing recurrent states to `bfloat16`) and *...	03-29 08:01	Success	-	View
exp_self.20260307230626.038_20260307_230651 Paper: self.20260307230626.038	Memory-Efficient SSM Distillation Benchmark README.md Memory-Efficient SSM Distillation Benchmark Overview This benchmark evaluates a hypothesis for training Selective State Space Models (SSMs) on constrained hardware (8GB GPU). It tests a distillation setup where a smaller Student S...	03-29 08:01	Success	-	View
exp_self.20260307230930.039_20260307_230959 Paper: self.20260307230930.039	Innovation Benchmark: Quantized State Caching for Low-Resource SSM Distillation README.md Innovation Benchmark: Quantized State Caching for Low-Resource SSM Distillation Overview This benchmark evaluates the "Quantized State Caching" hypothesis. It demonstrates that by applying dynamic precision (FP16/FP8) specifically...	03-29 08:01	Success	-	View
exp_self.20260307231325.040_20260307_231354 Paper: self.20260307231325.040	This benchmark evaluates Dynamic Precision State Caching for Selective State Space Models (SSMs). README.md This benchmark evaluates Dynamic Precision State Caching for Selective State Space Models (SSMs). Innovation The core hypothesis is that SSMs do not require full float32 precision for their recurrent hidden states at all times...	03-29 08:01	Success	-	View
exp_self.20260307231612.041_20260307_231647 Paper: self.20260307231612.041	The user wants a benchmark for "Adaptive Precision State Caching". I will implement a synthetic benchmark where: 1. A Teacher Transformer (FP32) processes a sequence. 2. A Student SSM processes the same sequence, guided by the teacher. 3. The SSM uses a `DynamicPrecisionCache` that stores recurrent states...	03-29 08:01	Success	-	View
exp_self.20260307231757.042_20260307_231833 Paper: self.20260307231757.042	This repository contains a runnable benchmark designed to evaluate the memory efficiency of a Dynamic Precision State Ca... README.md This repository contains a runnable benchmark designed to evaluate the memory efficiency of a Dynamic Precision State Caching mechanism for State Space Models (SSMs) during Knowledge Distillation. Innovation: Dynamic Precision Sta...	03-29 08:01	Success	-	View
exp_self.20260307232016.043_20260307_232210 Paper: self.20260307232016.043	Benchmark for Self-Regulated State Cache Precision Overview This benchmark is designed to validate the Self-Regulated State Cache Precision concept for State Space Models (SSMs). Since the architectural definition was previously empty, this implementation reconstructs the core hypothesi...	03-29 08:01	Success	-	View
exp_self.20260307232257.044_20260307_232325 Paper: self.20260307232257.044	Self-Regulated Quantized State Caching for SSM Distillation README.md Self-Regulated Quantized State Caching for SSM Distillation This benchmark evaluates a novel approach to memory-efficient State Space Model (SSM) training via Knowledge Distillation. The core innovation is a **Self-Regulated State...	03-29 08:01	Success	-	View
exp_self.20260307232836.046_20260307_233042 Paper: self.20260307232836.046	The following benchmark is designed to evaluate the efficiency of a Dynamic Precision State Caching mechanism for State... bash python benchmark.py ```	03-29 08:01	Success	-	View
exp_self.20260307233233.047_20260307_233256 Paper: self.20260307233233.047	Cache-Augmented SSM Distillation with Dynamic Precision State Management This repository contains the benchmark implementation for testing memory-efficient inference using a distilled State Space Model (SSM) augmented with a dynamic precision state cache. Overview The benchmark tests the hypothesis that a Studen...	03-29 08:01	Success	-	View
exp_self.20260307233516.048_20260307_233546 Paper: self.20260307233516.048	Section 1: README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260307234722.049_20260307_234750 Paper: self.20260307234722.049	Cache-Compressed Hybrid SSM Distillation Benchmark README.md Cache-Compressed Hybrid SSM Distillation Benchmark This benchmark evaluates a novel architecture designed to maximize context window handling and memory efficiency on consumer-grade hardware (8GB VRAM target). The Innovation: Hybr...	03-29 08:01	Success	-	View
exp_self.20260307234959.050_20260307_235024 Paper: self.20260307234959.050	This benchmark implements a proof-of-concept for Dynamic-Precision SSM Distillation. It validates the hypothesis tha... README.md This benchmark implements a proof-of-concept for Dynamic-Precision SSM Distillation. It validates the hypothesis that selectively reducing the precision of recurrent state tensors within a Selective State Space Model (SSM) stu...	03-29 08:01	Success	-	View
exp_self.20260307235331.051_20260307_235356 Paper: self.20260307235331.051	Benchmark Design: State-Aware Dynamic Precision SSM README.md bash pip install torch numpy bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260307235604.052_20260307_235630 Paper: self.20260307235604.052	Efficient SSM Distillation via Adaptive State Cache Precision README.md Efficient SSM Distillation via Adaptive State Cache Precision Overview This benchmark evaluates the hypothesis that applying dynamic precision scaling (FP16/INT8) specifically to the recurrent state cache of a Student SSM during k...	03-29 08:01	Success	-	View
exp_self.20260308000001.053_20260308_000033 Paper: self.20260308000001.053	Efficient Long-Context SSM Distillation via Dynamic State Caching This repository contains the benchmark implementation for testing hybrid memory architectures on State Space Models (SSMs). It demonstrates how moving long-term SSM hidden states to low-precision system RAM allows for effective distillation...	03-29 08:01	Success	-	View
exp_self.20260308000612.054_20260308_000643 Paper: self.20260308000612.054	Adaptive State Precision for Memory-Efficient SSM Distillation README.md Adaptive State Precision for Memory-Efficient SSM Distillation Overview This benchmark evaluates the hypothesis that applying dynamic precision techniques to the recurrent state cache (hidden states) of a State Space Model (SSM) d...	03-29 08:01	Success	-	View
exp_self.20260308000849.055_20260308_000914 Paper: self.20260308000849.055	Benchmark: Cache-Aware Dynamic Precision for Efficient SSM Distillation README.md Benchmark: Cache-Aware Dynamic Precision for Efficient SSM Distillation Overview This benchmark evaluates the hypothesis that applying dynamic precision reduction to the recurrent state cache of a State Space Model (SSM) during kn...	03-29 08:01	Success	-	View
exp_self.20260308001231.056_20260308_001433 Paper: self.20260308001231.056	```markdown README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308001506.057_20260308_001541 Paper: self.20260308001506.057	Benchmark: Cache-Aware Dynamic State Precision for SSM Distillation README.md Benchmark: Cache-Aware Dynamic State Precision for SSM Distillation This benchmark evaluates the hypothesis that applying dynamic precision quantization specifically to the recurrent state memory (cache) of a Student SSM during di...	03-29 08:01	Success	-	View
exp_self.20260308001901.058_20260308_001932 Paper: self.20260308001901.058	Dynamic State Precision for Low-VRAM SSM Distillation README.md Dynamic State Precision for Low-VRAM SSM Distillation This benchmark evaluates the efficacy of a hardware-aware dynamic precision wrapper applied to the recurrent state of a distilled Mamba-like model. Hypothesis Implementing a dy...	03-29 08:01	Success	-	View
exp_self.20260308002152.059_20260308_002220 Paper: self.20260308002152.059	Distilled SSM with Mixed-Precision State Caching Benchmark README.md Distilled SSM with Mixed-Precision State Caching Benchmark 1. Overview This benchmark validates the Distilled SSM with Mixed-Precision State Caching innovation. The core hypothesis is that a student State Space Model (SSM), tr...	03-29 08:01	Success	-	View
exp_self.20260308002536.060_20260308_002609 Paper: self.20260308002536.060	Cache-Aware Dynamic Precision SSM Distillation Overview This benchmark validates the Cache-Aware Dynamic Precision SSM Distillation methodology. It demonstrates a training loop where a Student SSM (Mamba-like) learns from a Teacher SSM while utilizing two key innovations: 1. **State...	03-29 08:01	Success	-	View
exp_self.20260308002820.061_20260308_003013 Paper: self.20260308002820.061	Benchmark: Memory-Efficient State Distillation for SSM Inference README.md Benchmark: Memory-Efficient State Distillation for SSM Inference Overview This benchmark evaluates the performance gains from "Memory-Efficient State Distillation" applied to State Space Models (SSMs). In standard SSM inference (e...	03-29 08:01	Success	-	View
exp_self.20260308003147.062_20260308_003348 Paper: self.20260308003147.062	Innovation: Selective State Caching for Efficient SSM Distillation README.md Innovation: Selective State Caching for Efficient SSM Distillation Overview This benchmark demonstrates the Selective State Caching mechanism designed to optimize the distillation process of Selective State Space Models (SSMs)...	03-29 08:01	Success	-	View
exp_self.20260308003447.063_20260308_003515 Paper: self.20260308003447.063	Innovation: Memory-Efficient State Space Model Distillation with Dynamic Caching README.md Innovation: Memory-Efficient State Space Model Distillation with Dynamic Caching Overview This benchmark validates a Dynamic Caching strategy for State Space Models (SSMs), specifically focusing on the Mamba architecture durin...	03-29 08:01	Success	-	View
exp_self.20260308003838.064_20260308_004042 Paper: self.20260308003838.064	Here is the runnable benchmark design for the "Efficient SSM Distillation via Selective State Caching" innovation. Design Rationale * Innovation Modeled: Selective State Caching for State Space Models (SSMs). * Scenario: Autoregressive generation (e.g., text generation) where an SSM needs to maintain a hidden state over a long context. * **Basel...	03-29 08:01	Success	-	View
exp_self.20260308004118.065_20260308_004156 Paper: self.20260308004118.065	Efficient Mamba Knowledge Distillation via Selective State-Aware Caching README.md bash pip install torch numpy bash python benchmark.py MODE: Baseline Full-Graph VRAM_USAGE: 2100MB TOKENS_PER_SEC: 1200 ... MODE: Selective State Caching VRAM_USAGE: 1450MB TOKENS_PER_SEC: 1150 ... RESULT: Memory reduced by 30.9%....	03-29 08:01	Success	-	View
exp_self.20260308004451.066_20260308_004528 Paper: self.20260308004451.066	Here is the runnable benchmark code for the "Memory-Efficient Mamba Distillation via Selective State Caching" innovation... README.md	03-29 08:01	Success	-	View
exp_self.20260308004759.067_20260308_004827 Paper: self.20260308004759.067	Dynamic Precision SSM Distillation Benchmark README.md Dynamic Precision SSM Distillation Benchmark This repository contains a minimal, self-contained benchmark designed to evaluate the memory efficiency of Dynamic Precision Selective State Space Models (SSM) during Knowledge Distilla...	03-29 08:01	Success	-	View
exp_self.20260308005213.068_20260308_005422 Paper: self.20260308005213.068	Offline SSM Distillation via Cached State Replay README.md Offline SSM Distillation via Cached State Replay This benchmark implements the "Offline SSM Distillation via Cached State Replay on Memory-Constrained Hardware" concept. Concept Standard Knowledge Distillation requires both the la...	03-29 08:01	Success	-	View
exp_self.20260308005505.069_20260308_005535 Paper: self.20260308005505.069	Memory-Efficient SSM Distillation via Cached State Replay README.md Memory-Efficient SSM Distillation via Cached State Replay This benchmark validates an innovation for training large sequence models on constrained hardware (8GB GPU) by utilizing Cached State Replay during the distillation of...	03-29 08:01	Success	-	View
exp_self.20260308005815.070_20260308_010006 Paper: self.20260308005815.070	Here is the runnable benchmark design for the Memory-Bounded SSM Distillation concept, including the requested documenta... README.md --- Benchmark: Memory-Bounded SSM Distillation via Selective State Caching Overview This benchmark evaluates the performance and memory efficiency of a Selective State Space Model (SSM) against a standard full-history SSM. **T...	03-29 08:01	Success	-	View
exp_self.20260308010051.071_20260308_010133 Paper: self.20260308010051.071	--- README.md --- Benchmark: CPU-Offloaded State Caching for SSM Distillation Overview This benchmark validates the hypothesis that offloading Teacher SSM (State Space Model) recurrent states to system RAM (CPU) during knowledge distillation re...	03-29 08:01	Success	-	View
exp_self.20260308010454.072_20260308_010539 Paper: self.20260308010454.072	CPU-Offloaded SSM State Distillation via Cached Replay README.md CPU-Offloaded SSM State Distillation via Cached Replay Innovation Summary This benchmark demonstrates a training strategy where a large Teacher Mamba model (State Space Model) pre-computes and caches its hidden states to system RA...	03-29 08:01	Success	-	View
exp_self.20260308010727.073_20260308_010807 Paper: self.20260308010727.073	Dynamic-Precision State Caching Benchmark This benchmark tests the "Dynamic-Precision State Caching" innovation designed for efficient SSM (State Space Model) distillation. The core hypothesis is that dynamically reducing the precision of the recurrent state tensor (from FP32 to FP...	03-29 08:01	Success	-	View
exp_self.20260308011137.074_20260308_011211 Paper: self.20260308011137.074	Benchmark: Dynamic-Precision Cached State Distillation for Memory-Efficient SSMs README.md Benchmark: Dynamic-Precision Cached State Distillation for Memory-Efficient SSMs Overview This benchmark tests the hypothesis that applying Dynamic Precision (AMP) specifically to cached recurrent states during the distill...	03-29 08:01	Success	-	View
exp_self.20260308011449.075_20260308_011545 Paper: self.20260308011449.075	Cached State Distillation for Memory-Efficient Mamba Training README.md This benchmark evaluates the hypothesis that implementing a state caching mechanism during the distillation of a Mamba SSM significantly reduces peak GPU memory usage compared to standard backpropagation through time (BPTT). Innov...	03-29 08:01	Success	-	View
exp_self.20260308011747.076_20260308_011820 Paper: self.20260308011747.076	README.md Memory-Efficient SSM Distillation Benchmark Innovation: Memory-Efficient SSM Distillation via Cached State Checkpointing This benchmark tests the hypothesis that implementing gradient checkpointing on a student SSM, combined with a read-onl...	03-29 08:01	Success	-	View
exp_self.20260308012031.077_20260308_012056 Paper: self.20260308012031.077	Memory-Efficient SSM Distillation via Dynamic Precision State Caching This repository contains a benchmarking suite designed to validate the hypothesis that applying dynamic precision (FP16) to the recurrent state cache during the distillation of State Space Models (SSMs) reduces peak GPU memory usage without...	03-29 08:01	Success	-	View
exp_self.20260308012429.078_20260308_012520 Paper: self.20260308012429.078	Here is the design for the benchmark. README.md Memory-Optimized State-Space Model Distillation Benchmark This benchmark evaluates the "Memory-Optimized State-Space Model Distillation via Selective State Caching" innovation. Hypothesis By offloading Teacher hidden states to CPU...	03-29 08:01	Success	-	View
exp_self.20260308012748.079_20260308_012851 Paper: self.20260308012748.079	Memory-Efficient Mamba Distillation Benchmark README.md Memory-Efficient Mamba Distillation Benchmark This benchmark validates the "Memory-Efficient Mamba Distillation" hypothesis. It simulates a distillation process between a large Teacher Mamba and a small Student Mamba. **Key In...	03-29 08:01	Success	-	View
exp_self.20260308013055.080_20260308_013129 Paper: self.20260308013055.080	```markdown README.md	03-29 08:01	Success	-	View
exp_self.20260308013248.081_20260308_013341 Paper: self.20260308013248.081	Memory-Efficient Distillation of Mamba Models via Selective State Caching README.md Memory-Efficient Distillation of Mamba Models via Selective State Caching Overview This benchmark validates the hypothesis that implementing a Selective State Caching mechanism during the distillation of a Mamba-based SSM (Sta...	03-29 08:01	Success	-	View
exp_self.20260308014005.083_20260308_014058 Paper: self.20260308014005.083	```markdown README.md bash pip install torch transformers datasets tqdm bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308015738.084_20260308_015818 Paper: self.20260308015738.084	Benchmark: CPU-Offloaded Selective State Caching for Mamba Distillation README.md Benchmark: CPU-Offloaded Selective State Caching for Mamba Distillation 1. Overview This benchmark validates the "CPU-Offloaded Selective State Caching" strategy for distilling large Mamba-style State Space Models (SSMs) on memory...	03-29 08:01	Success	-	View
exp_self.20260308015931.085_20260308_030123 Paper: self.20260308015931.085	Here is the design for the benchmark evaluating "Low-VRAM Mamba Distillation via Selective State Offloading". This bench... No summary available yet.	03-29 08:01	Success	-	View
exp_self.20260308030156.086_20260308_030218 Paper: self.20260308030156.086	```markdown bash python benchmark.py ``` Expected Outcome The script should run without `RuntimeError: CUDA out of memory`. You will observe high system RAM usage (due to the Teacher) but low, stable GPU VRAM usage (due to the Student-only on-device st...	03-29 08:01	Success	-	View
exp_self.20260308030436.087_20260308_030508 Paper: self.20260308030436.087	This repository contains a runnable benchmark for *Dynamic-Precision Mamba Distillation with CPU-Offloaded State Cache... README.md This repository contains a runnable benchmark for Dynamic-Precision Mamba Distillation with CPU-Offloaded State Cache. Objective The benchmark tests the hypothesis that dynamic precision scaling of SSM states combined with CPU...	03-29 08:01	Success	-	View
exp_self.20260308030729.088_20260308_030759 Paper: self.20260308030729.088	Benchmark: Dynamic-Precision Mamba Distillation with CPU-Offloaded State Caching README.md Benchmark: Dynamic-Precision Mamba Distillation with CPU-Offloaded State Caching Overview This benchmark tests the hypothesis that a student Mamba model can be trained efficiently on limited VRAM (targeting < 8GB) by utilizing **C...	03-29 08:01	Success	-	View
exp_self.20260308031115.089_20260308_031141 Paper: self.20260308031115.089	--- README.md --- Memory-Efficient Mamba Distillation Benchmark This benchmark evaluates the hypothesis that explicitly caching recurrent hidden states during Mamba distillation reduces peak VRAM usage and increases training throughput compared...	03-29 08:01	Success	-	View
exp_self.20260308031332.090_20260308_031534 Paper: self.20260308031332.090	Memory-Efficient Mamba Distillation via Selective State Caching This benchmark evaluates a novel approach to optimizing State Space Models (SSMs), specifically targeting Mamba architectures. The core innovation lies in combining model distillation with a selective state caching mechanism to dras...	03-29 08:01	Success	-	View
exp_self.20260308031743.091_20260308_031954 Paper: self.20260308031743.091	Benchmark: Segmented State Caching for Memory-Efficient Mamba Distillation README.md Benchmark: Segmented State Caching for Memory-Efficient Mamba Distillation Overview This benchmark evaluates a "Segmented State Caching" mechanism designed for State Space Models (SSMs), specifically targeting scenarios involving...	03-29 08:01	Success	-	View
exp_self.20260308032029.092_20260308_032105 Paper: self.20260308032029.092	Here is the design for the runnable benchmark. Section 1: README.md Section 2: benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308032340.093_20260308_032421 Paper: self.20260308032340.093	self.20260308032340.093 No summary available yet.	03-29 08:01	Success	-	View
exp_self.20260308032653.094_20260308_032733 Paper: self.20260308032653.094	--- README.md --- CPU-Offloaded State Distillation for 8GB Mamba Optimization Overview This benchmark implements and tests a novel training strategy for large-context State Space Models (SSMs), specifically targeting hardware constraints (e.g.,...	03-29 08:01	Success	-	View
exp_self.20260308033152.096_20260308_033230 Paper: self.20260308033152.096	Benchmark: Delta-Encoded State Caching for Mamba Distillation README.md Benchmark: Delta-Encoded State Caching for Mamba Distillation Innovation Summary This benchmark validates a memory-efficient distillation pipeline for State Space Models (SSMs), specifically focusing on the `Mamba` architecture. T...	03-29 08:01	Success	-	View
exp_self.20260308033357.097_20260308_033426 Paper: self.20260308033357.097	Recurrent State Caching for Low-Memory Mamba Distillation README.md Recurrent State Caching for Low-Memory Mamba Distillation Overview This benchmark validates the hypothesis that implementing a recurrent state caching strategy during the distillation of SSM-based Mamba models optimizes GPU memory...	03-29 08:01	Success	-	View
exp_self.20260308033644.098_20260308_033719 Paper: self.20260308033644.098	Benchmark: Segmented State Caching for Low-Memory Mamba Distillation README.md Benchmark: Segmented State Caching for Low-Memory Mamba Distillation Overview This benchmark tests the hypothesis that processing input sequences in discrete segments and caching only recurrent state boundaries—detached from the c...	03-29 08:01	Success	-	View
exp_self.20260308033948.099_20260308_034021 Paper: self.20260308033948.099	Here is the design for the "Selective State Retention for Memory-Constrained Mamba Distillation" benchmark. This solution uses a synthetic implementation of the Mamba SSM recurrence logic to ensure the code is runnable immediately without requiring complex CUDA-dependent compilation of the specific `mamba-ssm` library, while accurately demons...	03-29 08:01	Success	-	View
exp_self.20260308034400.100_20260308_034440 Paper: self.20260308034400.100	Title: Mamba Model Distillation with Cached State Retention README.md Title: Mamba Model Distillation with Cached State Retention Abstract: This benchmark evaluates the performance of distilling a pre-trained Mamba-130M State Space Model (SSM) into a smaller student variant. The core innovat...	03-29 08:01	Success	-	View
exp_self.20260308035621.101_20260308_035820 Paper: self.20260308035621.101	Efficient Mamba Distillation via Selective State Caching README.md Efficient Mamba Distillation via Selective State Caching Innovation Overview This benchmark validates the "Efficient Mamba Distillation via Selective State Caching" architecture. While the initial generation was skipped due to emp...	03-29 08:01	Success	-	View
exp_self.20260308035908.102_20260308_035942 Paper: self.20260308035908.102	```markdown bash pip install torch python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308040509.103_20260308_040537 Paper: self.20260308040509.103	Precision-Aware SSM Distillation Benchmark README.md Precision-Aware SSM Distillation Benchmark This repository provides a minimal, self-contained benchmark for evaluating Precision-Aware SSM Distillation with Adaptive State Caching. The Innovation This benchmark tests the hypot...	03-29 08:01	Success	-	View
exp_self.20260308041045.105_20260308_041111 Paper: self.20260308041045.105	```markdown README.md bash pip install torch bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308041307.106_20260308_041338 Paper: self.20260308041307.106	Memory-Efficient SSM Distillation Benchmark README.md Memory-Efficient SSM Distillation Benchmark This repository contains a runnable benchmark demonstrating "Memory-Efficient SSM Distillation via Adaptive Precision State Caching." Overview The benchmark compares a standard training...	03-29 08:01	Success	-	View
exp_self.20260308041520.107_20260308_041722 Paper: self.20260308041520.107	Adaptive-Precision SSM Distillation via State-Space Caching README.md Adaptive-Precision SSM Distillation via State-Space Caching This benchmark evaluates a novel approach to optimizing State Space Models (SSMs) for inference efficiency. The core innovation combines two techniques: 1. **State-Space...	03-29 08:01	Success	-	View
exp_self.20260308041758.108_20260308_042007 Paper: self.20260308041758.108	Backfill Implementation: Cache-Aware Dynamic Precision SSM README.md Backfill Implementation: Cache-Aware Dynamic Precision SSM Original Candidate: `self.20260308041758.108` Status: Backfilled (Original Architect Output was Empty) Overview This benchmark validates the concept of **Cache-Awa...	03-29 08:01	Success	-	View
exp_self.20260308042217.109_20260308_042244 Paper: self.20260308042217.109	Selective State-Space Distillation with Dynamic Precision Caching README.md Selective State-Space Distillation with Dynamic Precision Caching Overview This benchmark validates the hypothesis that a mixed-precision (Dynamic Precision) Student model, utilizing a Selective State-Space Model (SSM) architectur...	03-29 08:01	Success	-	View
exp_self.20260308042452.110_20260308_042514 Paper: self.20260308042452.110	This benchmark evaluates the "Distilled SSM Memory Efficiency via Dynamic Precision Caching" innovation. The goal is to... README.md This benchmark evaluates the "Distilled SSM Memory Efficiency via Dynamic Precision Caching" innovation. The goal is to demonstrate that a Student State-Space Model (SSM), trained via distillation from a Teacher model and utilizin...	03-29 08:01	Success	-	View
exp_self.20260308042824.111_20260308_042916 Paper: self.20260308042824.111	Design for Dynamic Precision State Caching Benchmark README.md This benchmark evaluates the "Dynamic Precision State Caching" hypothesis for State Space Models (SSMs). It simulates a Distilled Mamba-130M-like architecture to demonstrate that storing recurrent hidden states in lower precision...	03-29 08:01	Success	-	View
exp_self.20260308043115.112_20260308_043144 Paper: self.20260308043115.112	Benchmark: Dynamic Precision State Caching for Distilled Mamba Models README.md Benchmark: Dynamic Precision State Caching for Distilled Mamba Models This benchmark evaluates the hypothesis that implementing dynamic precision scaling for the state cache of a Selective State Space Model (SSM/Mamba) can reduce...	03-29 08:01	Success	-	View
exp_self.20260308043437.113_20260308_043504 Paper: self.20260308043437.113	Here is the design for the benchmark. bash pip install torch bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308043726.114_20260308_043806 Paper: self.20260308043726.114	Dynamic Precision State Caching for Distilled Mamba Inference README.md Dynamic Precision State Caching for Distilled Mamba Inference Overview This benchmark demonstrates a simulation of the "Dynamic Precision State Caching" innovation applied to a simplified SSM (State Space Model) architecture, insp...	03-29 08:01	Success	-	View
exp_self.20260308044116.115_20260308_044156 Paper: self.20260308044116.115	Distilled SSM with Dynamic State Precision and Memory Caching README.md Distilled SSM with Dynamic State Precision and Memory Caching Innovation Overview: This benchmark evaluates a hypothesis that a distilled State Space Model (SSM), utilizing dynamic precision on recurrent state caches, can sign...	03-29 08:01	Success	-	View
exp_self.20260308044407.116_20260308_044431 Paper: self.20260308044407.116	```markdown bash pip install torch numpy python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308044747.117_20260308_044822 Paper: self.20260308044747.117	```markdown No summary available yet.	03-29 08:01	Success	-	View
exp_self.20260308044952.118_20260308_045151 Paper: self.20260308044952.118	Here is the design for the "Entropy-Guided Dynamic State Precision for SSM Distillation" benchmark, implemented as a run... No summary available yet.	03-29 08:01	Success	-	View
exp_self.20260308045342.119_20260308_045530 Paper: self.20260308045342.119	State-Aware Dynamic Precision Distillation for SSMs README.md State-Aware Dynamic Precision Distillation for SSMs This benchmark evaluates the effectiveness of State-Aware Dynamic Precision techniques applied to State Space Models (SSMs) running on memory-constrained devices. Background...	03-29 08:01	Success	-	View
exp_self.20260308045625.120_20260308_045645 Paper: self.20260308045625.120	Here are the sections for the runnable benchmark. Cached-State Distillation of Dynamic-Precision SSMs Overview This benchmark evaluates a memory-efficient Knowledge Distillation pipeline for State Space Models (SSMs). It targets environments with strict 8GB VRAM constraints by combining tw...	03-29 08:01	Success	-	View
exp_self.20260308045958.121_20260308_050026 Paper: self.20260308045958.121	Dynamic-Precision State Distillation for Low-Memory SSMs README.md Dynamic-Precision State Distillation for Low-Memory SSMs Innovation Overview This benchmark evaluates a novel technique to enable large context processing on memory-constrained GPUs (8GB limit) by integrating Dynamic Precision...	03-29 08:01	Success	-	View
exp_self.20260308050428.123_20260308_050455 Paper: self.20260308050428.123	Low-Memory SSM Training via Dynamic-Precision State Caching and Distillation README.md Low-Memory SSM Training via Dynamic-Precision State Caching and Distillation Overview This benchmark tests the hypothesis that a Selective State Space Model (SSM) can be trained efficiently on limited VRAM (target < 7.5GB) by impl...	03-29 08:01	Success	-	View
exp_self.20260308050702.124_20260308_050722 Paper: self.20260308050702.124	--- README.md --- Benchmark: Gradient-Checkpointed SSMs with Dynamic State Precision and Distillation Overview This benchmark validates a hypothesis for training State Space Models (SSMs) on memory-constrained GPUs (target: 8GB). It combines th...	03-29 08:01	Success	-	View
exp_self.20260308051512.125_20260308_051746 Paper: self.20260308051512.125	Here is the runnable benchmark for the Low-Memory SSM Distillation via Dynamic-Precision State Caching innovation. bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308051844.126_20260308_051912 Paper: self.20260308051844.126	This repository contains the implementation and benchmarking suite for the research on Dynamic-Precision State Distill... README.md This repository contains the implementation and benchmarking suite for the research on Dynamic-Precision State Distillation for Efficient State Space Models (SSMs)**. Overview This innovation addresses the memory constraints of...	03-29 08:01	Success	-	View
exp_self.20260308052133.127_20260308_052201 Paper: self.20260308052133.127	Efficient SSM Distillation Benchmark README.md Efficient SSM Distillation Benchmark This benchmark evaluates a novel training strategy for State Space Models (SSMs) aimed at reducing GPU memory footprint during knowledge distillation. It tests the hypothesis that applying dyna...	03-29 08:01	Success	-	View
exp_self.20260308052829.129_20260308_052859 Paper: self.20260308052829.129	Dynamic-Precision Cached Distillation for Compact SSMs README.md Dynamic-Precision Cached Distillation for Compact SSMs This repository contains a minimal, runnable benchmark for the paper: "Dynamic-Precision Cached Distillation for Compact SSMs". Overview This benchmark demonstrates a nove...	03-29 08:01	Success	-	View
exp_self.20260308053107.130_20260308_053150 Paper: self.20260308053107.130	Here is the design for the Dynamic-Precision State Distillation benchmark. 1. README.md bash pip install torch python benchmark.py 2. benchmark.py ```python import torch import torch.nn as nn import time import math --- Minimal Mamba-Style SSM Implementation --- class MinimalSSMBlock(nn.Module): """ A minimal SSM...	03-29 08:01	Success	-	View
exp_self.20260308053441.131_20260308_053508 Paper: self.20260308053441.131	Dynamic-Precision State Caching for Distilled SSMs README.md Dynamic-Precision State Caching for Distilled SSMs This benchmark evaluates the memory efficiency and inference speed of a novel Dynamic-Precision State Caching mechanism applied to a distilled State Space Model (SSM). Overvie...	03-29 08:01	Success	-	View
exp_self.20260308053712.132_20260308_053900 Paper: self.20260308053712.132	Here is the runnable benchmark design for the Adaptive Precision State Caching for Distilled SSMs concept. Since the... README.md	03-29 08:01	Success	-	View
exp_self.20260308054504.135_20260308_054527 Paper: self.20260308054504.135	Benchmark: Dynamic-Precision State Caching for Distilled SSMs README.md Benchmark: Dynamic-Precision State Caching for Distilled SSMs This repository contains a runnable synthetic benchmark designed to validate the hypothesis of Dynamic-Precision State Caching for Distilled SSMs. Hypothesis By rep...	03-29 08:01	Success	-	View
exp_self.20260308055214.136_20260308_055401 Paper: self.20260308055214.136	Memory-Efficient Distilled Mamba with Dynamic State Caching README.md Memory-Efficient Distilled Mamba with Dynamic State Caching This benchmark evaluates a Memory-Efficient Distilled Mamba architecture implementing Dynamic State Caching and Dynamic Precision. The Innovation The core inn...	03-29 08:01	Success	-	View
exp_self.20260308055449.137_20260308_055515 Paper: self.20260308055449.137	Here is the design and implementation for the requested benchmark. No summary available yet.	03-29 08:01	Success	-	View
exp_self.20260308055858.138_20260308_060103 Paper: self.20260308055858.138	Here is the benchmark design for the concept described in the title, strictly adhering to your formatting requirements. README.md Adaptive-Precision SSM State Caching Benchmark Overview This benchmark evaluates the Memory-Efficient SSM State Caching innovation. State Space Models (SSMs) require maintaining a hidden state that grows with sequence length o...	03-29 08:01	Success	-	View
exp_self.20260308060154.139_20260308_060214 Paper: self.20260308060154.139	Hybrid-Precision Distilled SSM Benchmark README.md Hybrid-Precision Distilled SSM Benchmark Overview This benchmark validates the "Hybrid-Precision Distilled SSM" innovation. The core hypothesis is that storing the recurrent state tensors of a State Space Model (SSM) in FP16 (half...	03-29 08:01	Success	-	View
exp_self.20260308060518.140_20260308_060541 Paper: self.20260308060518.140	Benchmark: Dynamic-Precision State Caching for Distilled SSMs README.md Benchmark: Dynamic-Precision State Caching for Distilled SSMs This benchmark evaluates the memory efficiency and performance of a Distilled State Space Model (SSM) that utilizes Dynamic-Precision State Caching. Hypothesis...	03-29 08:01	Success	-	View
exp_self.20260308061131.141_20260308_061329 Paper: self.20260308061131.141	Here is the design for the benchmark. Since the original experiment was skipped due to an empty architect output, I have... README.md Cache-Augmented Memory Optimization for Mamba Model Distillation Overview This benchmark implements a framework for distilling knowledge from a large Teacher Mamba model to a smaller Student Mamba model. The specific innovation be...	03-29 08:01	Success	-	View
exp_self.20260308061401.142_20260308_061554 Paper: self.20260308061401.142	Here is the runnable benchmark for the Cache-Augmented Distillation innovation. bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308061738.143_20260308_061800 Paper: self.20260308061738.143	Dynamic-Precision SSM Distillation Benchmark README.md Dynamic-Precision SSM Distillation Benchmark This benchmark evaluates the hypothesis that a Mamba-based State Space Model (SSM) student, distilled from a Transformer teacher using dynamic precision (BF16) and explicit state cachin...	03-29 08:01	Success	-	View
exp_self.20260308063850.001_20260308_064044 Paper: self.20260308063850.001	Benchmark: Dynamic-Precision SSM with Unified State Caching README.md Benchmark: Dynamic-Precision SSM with Unified State Caching This repository contains a benchmark for evaluating the efficiency of Dynamic-Precision State Space Models (SSM) utilizing Unified State Caching. Overview The ben...	03-29 08:01	Success	-	View
exp_self.20260308064650.001_20260308_064851 Paper: self.20260308064650.001	Here is the benchmark design for the Dynamic-Precision SSM Distillation with Unified State Caching innovation. Since the original abstract was empty ("architect_output_empty"), I have synthesized the core logic for the benchmark: 1. SSM (State Space Model): Modeled using a simplified selective recurrent layer to simulate Mamba-like architecture....	03-29 08:01	Success	-	View
exp_self.20260308064929.002_20260308_064956 Paper: self.20260308064929.002	--- README.md --- Dynamic-Precision SSM Distillation Benchmark Overview This benchmark validates the hypothesis that distilling a dense Transformer teacher into a Mamba-style SSM student using dynamic precision (int8 weights, fp16 states) a...	03-29 08:01	Success	-	View
exp_self.20260308065328.003_20260308_065411 Paper: self.20260308065328.003	Memory-Efficient SSM Distillation Benchmark This benchmark tests the hypothesis that offloading recurrent states to CPU during the distillation of a large SSM (Teacher) to a small SSM (Student) reduces VRAM usage significantly while maintaining training throughput. --- README.md Memo...	03-29 08:01	Success	-	View
exp_self.20260308065629.004_20260308_065716 Paper: self.20260308065629.004	State-Aligned Mamba Distillation Benchmark README.md State-Aligned Mamba Distillation Benchmark This benchmark evaluates State-Aligned Mamba Distillation, a technique designed to train efficient student Mamba models by aligning their internal recurrent states with a larger teach...	03-29 08:01	Success	-	View
exp_self.20260308065913.005_20260308_065941 Paper: self.20260308065913.005	Here is the design for the benchmarking code focusing on CPU-offloaded state caching for Mamba distillation. README.md Benchmark: CPU-Offloaded State Caching for Efficient Mamba Distillation This benchmark validates the hypothesis that CPU-offloading teacher states during SSM (State Space Model) distillation significantly reduces GPU VRAM consumpt...	03-29 08:01	Success	-	View
exp_self.20260308070311.006_20260308_070343 Paper: self.20260308070311.006	Dynamic Precision SSM Distillation with Hierarchical Memory Caching README.md Dynamic Precision SSM Distillation with Hierarchical Memory Caching This repository contains the benchmarking suite for the Dynamic Precision SSM Distillation innovation. Overview This innovation aims to enable efficient proce...	03-29 08:01	Success	-	View
exp_self.20260308070557.007_20260308_070631 Paper: self.20260308070557.007	Dynamic Precision SSM Distillation Benchmark README.md Dynamic Precision SSM Distillation Benchmark This benchmark evaluates a novel training strategy for Selective State Space Models (SSMs), specifically testing the hypothesis that applying Dynamic Precision techniques to the...	03-29 08:01	Success	-	View
exp_self.20260308070842.008_20260308_070915 Paper: self.20260308070842.008	Here is the runnable benchmark design for the "GPU-Efficient Distilled SSM" innovation. README.md GPU-Efficient Distilled SSM with Dynamic State Caching Overview This benchmark evaluates a novel training approach for State Space Models (SSMs) designed for resource-constrained environments (e.g., 8GB GPUs). It combines Knowledg...	03-29 08:01	Success	-	View
exp_self.20260308071226.009_20260308_071304 Paper: self.20260308071226.009	This benchmark evaluates a Layer-wise Dynamic Precision SSM against a standard Transformer baseline. README.md This benchmark evaluates a Layer-wise Dynamic Precision SSM against a standard Transformer baseline. Hypothesis By monitoring gradient norms, we can dynamically cast stable layers of a State Space Model (SSM) to FP16 (simulate...	03-29 08:01	Success	-	View
exp_self.20260308071555.010_20260308_071653 Paper: self.20260308071555.010	Memory-Efficient SSM Distillation Benchmark README.md Memory-Efficient SSM Distillation Benchmark This benchmark evaluates the "Memory-Efficient SSM Distillation with Dynamic Precision and State Caching" innovation. Goal: Demonstrate that a lightweight State Space Model (SSM) stu...	03-29 08:01	Success	-	View
exp_self.20260308071929.011_20260308_072001 Paper: self.20260308071929.011	Low-Memory SSM Distillation Benchmark README.md Low-Memory SSM Distillation Benchmark This benchmark evaluates the effectiveness of Dynamic State Precision for State Space Models (SSMs) during knowledge distillation. Hypothesis Dynamically down-casting recurrent state tenso...	03-29 08:01	Success	-	View
exp_self.20260308072213.012_20260308_072241 Paper: self.20260308072213.012	SSM Distillation with Dynamic Precision State Caching This repository contains a minimal, runnable benchmark designed to validate the "SSM Distillation with Dynamic Precision State Caching" innovation. Hypothesis Implementing a dynamic precision cache for recurrent states during SSM distillati...	03-29 08:01	Success	-	View
exp_self.20260308072603.013_20260308_072640 Paper: self.20260308072603.013	Benchmark: SSM Distillation via Recurrent State Caching README.md Benchmark: SSM Distillation via Recurrent State Caching Overview This benchmark validates the memory efficiency of Recurrent State Caching during the distillation of State Space Models (SSMs). Specifically, it tests the hypoth...	03-29 08:01	Success	-	View
exp_self.20260308072849.014_20260308_072924 Paper: self.20260308072849.014	Efficient SSM Distillation via Static State Caching README.md Efficient SSM Distillation via Static State Caching Overview This benchmark demonstrates the innovation of Static State Caching during the distillation of Mamba-based State Space Models (SSMs). The Hypothesis: By freezing...	03-29 08:01	Success	-	View
exp_self.20260308073322.015_20260308_073357 Paper: self.20260308073322.015	Memory-Efficient SSM Distillation via CPU Offloaded State Caching README.md Memory-Efficient SSM Distillation via CPU Offloaded State Caching Benchmark Overview This benchmark evaluates the hypothesis that pre-computing Teacher SSM states and offloading them to CPU system RAM allows for memory-efficie...	03-29 08:01	Success	-	View
exp_self.20260308073628.016_20260308_073723 Paper: self.20260308073628.016	Here is the design for the benchmark. README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308073923.017_20260308_074006 Paper: self.20260308073923.017	Efficient Mamba Distillation with CPU-Offloaded State Cache README.md Efficient Mamba Distillation with CPU-Offloaded State Cache Overview This benchmark validates an innovation designed to enable large-context processing on memory-constrained GPUs (e.g., 8GB VRAM) by combining model distillation wi...	03-29 08:01	Success	-	View
exp_self.20260308075306.018_20260308_075352 Paper: self.20260308075306.018	SSM Distillation with Selective State Caching README.md SSM Distillation with Selective State Caching Overview This benchmark demonstrates a memory-efficient distillation pipeline for State Space Models (SSMs), specifically Mamba-style architectures. The Innovation: Standard backpr...	03-29 08:01	Success	-	View
exp_self.20260308075611.019_20260308_075715 Paper: self.20260308075611.019	```markdown README.md bash pip install torch transformers python benchmark.py MODE: CPU_OFFLOAD_Q8 VRAM_USAGE: 450MB TOKENS_PER_SEC: 1200 RESULT: SUCCESS (Memory Optimized, Loss Converged) ---	03-29 08:01	Success	-	View
exp_self.20260308075923.020_20260308_080004 Paper: self.20260308075923.020	Memory-Efficient State-Space Distillation Benchmark README.md Memory-Efficient State-Space Distillation Benchmark This benchmark demonstrates a memory-efficient training strategy for State-Space Models (SSMs), specifically tailored for Mamba-like architectures. The innovation, **Recurrent Ca...	03-29 08:01	Success	-	View
exp_self.20260308080247.021_20260308_080331 Paper: self.20260308080247.021	Here is the runnable benchmark design for the State-Space Distillation innovation. README.md State-Space Distillation via Latent Memory Alignment Hypothesis Distilling the internal recurrent memory states of a teacher State Space Model (SSM) into a smaller student model yields superior accuracy compared to standard logit-...	03-29 08:01	Success	-	View
exp_self.20260308080540.022_20260308_080639 Paper: self.20260308080540.022	Benchmark Design: SSM-Mamba Distillation with Segment-Based Latent Caching This benchmark evaluates a simplified Mamba-style State Space Model (SSM) implementation where a large Teacher model distills knowledge into a smaller Student model. To handle long-context sequences without exceeding VRAM, we utilize a segm...	03-29 08:01	Success	-	View
exp_self.20260308081351.024_20260308_081432 Paper: self.20260308081351.024	Section 1: README.md bash python benchmark.py MODE: baseline VRAM_USAGE: <value>MB TOKENS_PER_SEC: <value> ... MODE: innovation VRAM_USAGE: <value>MB TOKENS_PER_SEC: <value> ... RESULT: Memory reduction of <percentage>% achieved. ```	03-29 08:01	Success	-	View
exp_self.20260308081824.025_20260308_081908 Paper: self.20260308081824.025	Here is the design for the benchmark evaluating Dynamic Precision State-Space Distillation with Cache Optimization. Design Philosophy The benchmark implements a minimal but functionally accurate State-Space Model (SSM) layer that mimics the recurrent memory behavior of Mamba architectures. 1. Models: A Teacher (large) and a Student (small) SSM ar...	03-29 08:01	Success	-	View
exp_self.20260308082217.026_20260308_082258 Paper: self.20260308082217.026	Dynamic Precision State-Space Distillation with Adaptive Caching This repository contains a benchmark for the proposed "Dynamic Precision State-Space Distillation" technique. The goal is to demonstrate that utilizing adaptive precision (FP16/FP8) for the recurrent state tensors ($h_t$) in a State Space M...	03-29 08:01	Success	-	View
exp_self.20260308082455.027_20260308_082708 Paper: self.20260308082455.027	Here is a runnable benchmark designed for the Hybrid SSM-Transformer with Dynamic Precision Caching concept. Since the original experiment output was empty, I have synthesized a representative architecture that combines: 1. Hybrid Layers: Alternating blocks of Standard Attention (Transformer) and Selective State Space (Mamba-like) blocks. 2. *...	03-29 08:01	Success	-	View
exp_self.20260308082751.028_20260308_082940 Paper: self.20260308082751.028	Innovation: Dynamic Precision SSM Distillation with Selective Memory Caching README.md Innovation: Dynamic Precision SSM Distillation with Selective Memory Caching This benchmark evaluates the efficiency of a theoretical distilled State Space Model (SSM) that employs two primary optimization strategies: 1. **Dynamic...	03-29 08:01	Success	-	View
exp_self.20260308083059.029_20260308_083130 Paper: self.20260308083059.029	Dynamic Precision SSM Distillation Benchmark README.md Dynamic Precision SSM Distillation Benchmark Innovation: Dynamic Precision SSM Distillation with Recurrent State Caching Hypothesis: We hypothesize that distilling a lightweight State Space Model (SSM) student from a froze...	03-29 08:01	Success	-	View
exp_self.20260308083509.030_20260308_083535 Paper: self.20260308083509.030	```markdown bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308083731.031_20260308_083802 Paper: self.20260308083731.031	This repository contains a standalone benchmark to evaluate the efficiency gains of Dynamic Precision SSM Distillation... README.md This repository contains a standalone benchmark to evaluate the efficiency gains of Dynamic Precision SSM Distillation with Recurrent State Caching**. Overview State Space Models (SSMs), such as Mamba, offer significant potentia...	03-29 08:01	Success	-	View
exp_self.20260308084022.032_20260308_084110 Paper: self.20260308084022.032	Here are the two sections as requested. README.md bash pip install torch numpy bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308084335.033_20260308_084415 Paper: self.20260308084335.033	Dynamic Precision SSM Distillation with State Memory Caching README.md Dynamic Precision SSM Distillation with State Memory Caching Overview This benchmark evaluates a novel approach to training State Space Models (SSMs) by combining Dynamic Precision (Automatic Mixed Precision - AMP) with **Stat...	03-29 08:01	Success	-	View
exp_self.20260308084629.034_20260308_084704 Paper: self.20260308084629.034	Dynamic Precision SSM Distillation with Detached State Caching README.md Dynamic Precision SSM Distillation with Detached State Caching This repository contains a benchmark implementation designed to validate the hypothesis that a detached recurrent state cache strategy, combined with Automatic Mixed P...	03-29 08:01	Success	-	View
exp_self.20260308085103.035_20260308_085138 Paper: self.20260308085103.035	```markdown bash pip install torch python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308090156.036_20260308_090224 Paper: self.20260308090156.036	```markdown README.md bash pip install torch bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308090431.037_20260308_090514 Paper: self.20260308090431.037	```markdown bash python benchmark.py ``` Expected Output The script outputs: * VRAM_USAGE: Peak memory allocated during the operation. * TOKENS_PER_SEC: Throughput measured in tokens generated per second. * RESULT: A final verification comp...	03-29 08:01	Success	-	View
exp_self.20260308090818.038_20260308_091011 Paper: self.20260308090818.038	Here is the runnable benchmark design. README.md Dynamic Precision SSM with State Caching: Efficiency Benchmark Overview This benchmark evaluates the proposed innovation: Dynamic Precision SSM Distillation with Cached State Memory. The goal is to demonstrate the efficiency g...	03-29 08:01	Success	-	View
exp_self.20260308091114.039_20260308_091138 Paper: self.20260308091114.039	Here is the runnable benchmark code. bash pip install torch bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308091518.040_20260308_091552 Paper: self.20260308091518.040	Section 1: README.md bash pip install torch bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308091816.041_20260308_091853 Paper: self.20260308091816.041	Efficient SSM Distillation Benchmark README.md Efficient SSM Distillation Benchmark This benchmark evaluates the "Efficient SSM Distillation" innovation. The core hypothesis is that a student State Space Model (SSM) can maintain training stability comparable to a Transformer t...	03-29 08:01	Success	-	View
exp_self.20260308092146.042_20260308_092256 Paper: self.20260308092146.042	--- README.md Memory-Efficient SSM Distillation via Dynamic State Caching Overview This benchmark evaluates a knowledge distillation pipeline where a Transformer Teacher model trains a State Space Model (SSM) Student. The core innovation te...	03-29 08:01	Success	-	View
exp_self.20260308092508.043_20260308_092557 Paper: self.20260308092508.043	Dynamic-Precision SSM Distillation Benchmark README.md Dynamic-Precision SSM Distillation Benchmark This benchmark validates the hypothesis that Dynamic-Precision SSM Distillation with Selective State Caching reduces GPU memory usage for long-context sequences while maintaining ac...	03-29 08:01	Success	-	View
exp_self.20260308092826.044_20260308_092918 Paper: self.20260308092826.044	Adaptive-Precision SSM Distillation Benchmark README.md Adaptive-Precision SSM Distillation Benchmark This repository contains the benchmarking code for evaluating Adaptive-Precision SSM Distillation with Cached State Memory. Hypothesis Implementing layer-wise dynamic precision adj...	03-29 08:01	Success	-	View
exp_self.20260308093127.045_20260308_093201 Paper: self.20260308093127.045	Low-Resource SSM Distillation Benchmark README.md Low-Resource SSM Distillation Benchmark Overview This benchmark evaluates the hypothesis that a lightweight Selective State Space Model (SSM), utilizing Selective State Caching and Dynamic Precision (AMP) training, can pro...	03-29 08:01	Success	-	View
exp_self.20260308094910.046_20260308_094952 Paper: self.20260308094910.046	```markdown bash python benchmark.py ``` 3. The script will output VRAM usage, processing speed, and final verification results.	03-29 08:01	Success	-	View
exp_self.20260308095221.047_20260308_095310 Paper: self.20260308095221.047	--- README.md --- VRAM-Efficient SSM Distillation Benchmark This benchmark validates the VRAM-Efficient SSM Distillation innovation, which utilizes Adaptive State Quantization and Selective Caching to reduce memory footprint during the trai...	03-29 08:01	Success	-	View
exp_self.20260308095539.048_20260308_095613 Paper: self.20260308095539.048	Adaptive State Distillation for Memory-Constrained Mamba Models README.md Adaptive State Distillation for Memory-Constrained Mamba Models Innovation Overview This benchmark demonstrates a novel training strategy for State Space Models (specifically Mamba). The hypothesis is that distilling a large Teach...	03-29 08:01	Success	-	View
exp_self.20260308095844.049_20260308_095917 Paper: self.20260308095844.049	Adaptive State Distillation Benchmark README.md Adaptive State Distillation Benchmark This benchmark evaluates the Adaptive State Distillation technique designed to train large State Space Models (SSMs) on memory-constrained hardware (8GB VRAM). Methodology The code impleme...	03-29 08:01	Success	-	View
exp_self.20260308100138.050_20260308_100331 Paper: self.20260308100138.050	Here is the design for the benchmark based on the "Dynamic Precision State Distillation" concept. Since the original arc... README.md Benchmark: Dynamic Precision SSM Inference Overview This benchmark evaluates the memory efficiency and throughput of Dynamic Precision State Distillation concepts on State Space Models (SSMs). Since the target architecture was...	03-29 08:01	Success	-	View
exp_self.20260308100426.051_20260308_100609 Paper: self.20260308100426.051	Benchmark: Dynamic Precision State Distillation for VRAM-Constrained SSMs README.md Benchmark: Dynamic Precision State Distillation for VRAM-Constrained SSMs Overview This benchmark evaluates the "Dynamic Precision State Distillation" technique applied to a synthetic State Space Model (SSM). The core innovation i...	03-29 08:01	Success	-	View
exp_self.20260308100758.052_20260308_100829 Paper: self.20260308100758.052	Dynamic Precision State Cache Distillation for SSMs README.md Dynamic Precision State Cache Distillation for SSMs Innovation Overview This benchmark demonstrates a novel technique to optimize State Space Models (SSMs) for deployment on consumer-grade hardware (8GB VRAM). By applying **Dynami...	03-29 08:01	Success	-	View
exp_self.20260308101059.053_20260308_101125 Paper: self.20260308101059.053	--- README.md Dynamic Precision State Caching for Distilled SSMs Overview This benchmark evaluates the "Dynamic Precision State Caching" innovation applied to a distilled Mamba-style Selective State Space Model (SSM). Hypothesis: By dynamic...	03-29 08:01	Success	-	View
exp_self.20260308101507.054_20260308_101548 Paper: self.20260308101507.054	Benchmark: Dynamic Precision State Cache for Memory-Efficient SSM Distillation README.md Benchmark: Dynamic Precision State Cache for Memory-Efficient SSM Distillation 1. Objective This benchmark evaluates the hypothesis that Dynamic Precision State Caching significantly reduces the peak VRAM consumption of State...	03-29 08:01	Success	-	View
exp_self.20260308102733.055_20260308_102801 Paper: self.20260308102733.055	--- FILE_BREAK--- Benchmark: Phase-Shifted Distillation for Low-Precision SSMs Overview This benchmark validates the "Phase-Shifted Distillation" hypothesis. It tests whether a dynamic precision schedule applied to a Student State Space Model (...	03-29 08:01	Success	-	View
exp_self.20260308103017.056_20260308_103051 Paper: self.20260308103017.056	Dynamic Precision State Caching for Distilled SSMs README.md Dynamic Precision State Caching for Distilled SSMs Overview This benchmark evaluates the "Dynamic Precision State Caching" innovation for State Space Models (SSMs), specifically targeting memory-constrained hardware. **Hypothesis:...	03-29 08:01	Success	-	View
exp_self.20260308103400.057_20260308_103436 Paper: self.20260308103400.057	Section 1: README.md Adaptive Precision State Caching for Distilled SSMs Overview This benchmark validates the "Adaptive Precision State Caching" innovation applied to a distilled State Space Model (SSM). The core hypothesis is that by storing the recurrent hid...	03-29 08:01	Success	-	View
exp_self.20260308103547.058_20260308_103622 Paper: self.20260308103547.058	--- README.md --- Dynamic Precision State Caching for Distilled SSMs Overview This benchmark implements a lightweight, custom Selective State Space Model (SSM) inspired by Mamba. It demonstrates a memory-efficient training strategy combining **...	03-29 08:01	Success	-	View
exp_self.20260308103826.059_20260308_103907 Paper: self.20260308103826.059	Tiered-Precision State Distillation Benchmark README.md Tiered-Precision State Distillation Benchmark This benchmark validates the memory efficiency of a Tiered-Precision State Caching mechanism for State Space Models (SSMs). Hypothesis By implementing a tiered caching mechanism that d...	03-29 08:01	Success	-	View
exp_self.20260308104114.060_20260308_104158 Paper: self.20260308104114.060	--- README.md Distilled Adaptive-Precision State Caching for Memory-Efficient SSMs This repository contains the benchmark implementation for the "Distilled Adaptive-Precision State Caching" innovation. Overview This project demonstrates a novel...	03-29 08:01	Success	-	View
exp_self.20260308104421.061_20260308_104451 Paper: self.20260308104421.061	```markdown README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308104741.062_20260308_104803 Paper: self.20260308104741.062	Section 1: README.md Adaptive-Precision Distilled State Caching for Memory-Bound SSMs Benchmark Overview This benchmark evaluates a novel memory optimization technique for State Space Models (SSMs). The innovation combines knowledge distillation with a **ti...	03-29 08:01	Success	-	View
exp_self.20260308105024.063_20260308_105047 Paper: self.20260308105024.063	Distilled State-Space Models with Temporal Dynamic Precision Caching README.md bash pip install torch tqdm bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308105413.064_20260308_105454 Paper: self.20260308105413.064	Low-Memory Distilled SSMs via Tiered Dynamic Precision Caching README.md Low-Memory Distilled SSMs via Tiered Dynamic Precision Caching Overview This benchmark evaluates a novel memory optimization technique for State-Space Models (SSMs) during long-context inference. The innovation involves "Tiered Dy...	03-29 08:01	Success	-	View
exp_self.20260308105706.065_20260308_110021 Paper: self.20260308105706.065	Benchmark: Tiered-Precision State Caching for SSMs README.md Benchmark: Tiered-Precision State Caching for SSMs Overview This benchmark evaluates the efficacy of Tiered-Precision State Caching, a technique designed to optimize memory usage and inference speed for Long-Context State Spac...	03-29 08:01	Success	-	View
exp_self.20260308110112.066_20260308_110316 Paper: self.20260308110112.066	Here is the runnable benchmark code designed for the Tiered-Precision Distilled Mamba concept. README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308110352.067_20260308_110415 Paper: self.20260308110352.067	Efficient Mamba Distillation Benchmark This benchmark evaluates the "Dynamic Precision State Caching" technique applied to a distilled Student-Teacher Mamba pipeline. Hypothesis Storing the recurrent hidden state `h_t` in `bfloat16` instead of `float32` reduces peak VRAM usage d...	03-29 08:01	Success	-	View
exp_self.20260308111607.068_20260308_111641 Paper: self.20260308111607.068	Dynamic Precision Mamba Distillation Benchmark README.md Dynamic Precision Mamba Distillation Benchmark This repository contains a benchmark designed to evaluate the efficiency gains of a Dynamic Precision Mamba model distilled from a larger Transformer teacher, utilizing a **Persis...	03-29 08:01	Success	-	View
exp_self.20260308111853.069_20260308_112037 Paper: self.20260308111853.069	Here is the benchmark design based on the provided internal policies and the "Dynamic Precision State Space Distillation... README.md --- Benchmark: Dynamic Precision SSM with Adaptive Caching Overview This benchmark evaluates the performance characteristics of a State Space Model (SSM) enhanced with Dynamic Precision and Adaptive Caching mechanisms....	03-29 08:01	Success	-	View
exp_self.20260308112212.070_20260308_112247 Paper: self.20260308112212.070	Section 1: README.md Section 2: benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308112501.071_20260308_112524 Paper: self.20260308112501.071	Benchmark: Dynamic Precision Distilled SSM README.md Benchmark: Dynamic Precision Distilled SSM Overview This benchmark evaluates a Dynamic Precision Distilled State Space Model (SSM). The core hypothesis is that selectively applying lower precision (bfloat16/float16) to the rec...	03-29 08:01	Success	-	View
exp_self.20260308113518.001_20260308_113547 Paper: self.20260308113518.001	```markdown README.md bash pip install torch tqdm bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308113759.002_20260308_113824 Paper: self.20260308113759.002	Dynamic Precision SSM & Caching Distillation Benchmark README.md Dynamic Precision SSM & Caching Distillation Benchmark This benchmark validates the hypothesis that a Dynamic Precision Selective State Space Model (SSM) with Memory-Efficient Caching significantly reduces GPU memory footp...	03-29 08:01	Success	-	View
exp_self.20260308114155.003_20260308_114225 Paper: self.20260308114155.003	Here is the design for the runnable benchmark. README.md Mixed-Precision Cached State Distillation Benchmark This repository contains a minimal, runnable benchmark designed to validate the hypothesis that Dynamic Precision and State Caching can significantly reduce VRAM usage an...	03-29 08:01	Success	-	View
exp_self.20260308114440.004_20260308_114507 Paper: self.20260308114440.004	Dynamic Precision Distilled SSM Benchmark README.md Dynamic Precision Distilled SSM Benchmark This benchmark evaluates the hypothesis that a Student State Space Model (SSM), utilizing dynamic precision and a segment-aware state cache, achieves lower peak VRAM usage and higher infer...	03-29 08:01	Success	-	View
exp_self.20260308114844.005_20260308_114905 Paper: self.20260308114844.005	Section 1: README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308115116.006_20260308_115154 Paper: self.20260308115116.006	Efficient Distillation of Memory-Cached SSMs README.md Efficient Distillation of Memory-Cached SSMs This benchmark demonstrates the efficiency gains of applying Dynamic Precision and State Caching to a student State Space Model (SSM) that has been distilled from a larger teach...	03-29 08:01	Success	-	View
exp_self.20260308120841.001_20260308_120920 Paper: self.20260308120841.001	Adaptive Precision Caching for SSM Distillation Benchmark README.md Adaptive Precision Caching for SSM Distillation Benchmark This repository contains a synthetic benchmark designed to validate the "Adaptive Precision Caching for SSM Distillation" hypothesis. It simulates a State Space Model (SSM)...	03-29 08:01	Success	-	View
exp_self.20260308121148.002_20260308_121224 Paper: self.20260308121148.002	Dynamic State Precision for Low-Memory SSM Distillation README.md Dynamic State Precision for Low-Memory SSM Distillation Overview This benchmark validates the hypothesis that dynamically reducing the numerical precision of recurrent state tensors (the SSM cache) during training allows for proce...	03-29 08:01	Success	-	View
exp_self.20260308121501.003_20260308_121531 Paper: self.20260308121501.003	Dynamic-Precision SSM Distillation Benchmark README.md Dynamic-Precision SSM Distillation Benchmark This repository contains a minimal, runnable benchmark to evaluate the efficiency of Dynamic-Precision State Space Models (SSM) combined with Knowledge Distillation and **Cached...	03-29 08:01	Success	-	View
exp_self.20260308121814.004_20260308_121842 Paper: self.20260308121814.004	Low-Bit State Caching for Distilled Mamba Inference README.md Low-Bit State Caching for Distilled Mamba Inference This benchmark validates the hypothesis that applying dynamic precision quantization to the recurrent state cache of a distilled Mamba-style State Space Model (SSM) significantly...	03-29 08:01	Success	-	View
exp_self.20260308122151.005_20260308_122216 Paper: self.20260308122151.005	Dynamic-Precision State Caching for Memory-Efficient SSM Distillation README.md Dynamic-Precision State Caching for Memory-Efficient SSM Distillation Overview This benchmark evaluates the hypothesis that applying dynamic precision reduction to the recurrent state caches of a Student State Space Model (SSM) du...	03-29 08:01	Success	-	View
exp_self.20260308122430.006_20260308_122746 Paper: self.20260308122430.006	```markdown README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308122833.007_20260308_123025 Paper: self.20260308122833.007	Benchmark: Cache-Augmented Dynamic Precision SSM Distillation README.md Benchmark: Cache-Augmented Dynamic Precision SSM Distillation Overview This benchmark validates the "Backfill Candidate" concept for Cache-Augmented Dynamic Precision SSM Distillation. Although the original experiment (`self.2...	03-29 08:01	Success	-	View
exp_self.20260308123118.008_20260308_123150 Paper: self.20260308123118.008	Cache-Augmented Dynamic Precision SSM Distillation README.md Cache-Augmented Dynamic Precision SSM Distillation This repository contains a runnable benchmark demonstrating the Cache-Augmented Dynamic Precision SSM Distillation technique. Abstract This innovation hypothesizes that applyi...	03-29 08:01	Success	-	View
exp_self.20260308123338.009_20260308_123532 Paper: self.20260308123338.009	Benchmark: Memory-Efficient Distilled SSM with Dynamic Precision README.md Benchmark: Memory-Efficient Distilled SSM with Dynamic Precision This benchmark evaluates the performance characteristics of a synthetic State Space Model (SSM) architecture designed for memory efficiency and dynamic precision...	03-29 08:01	Success	-	View
exp_self.20260308123622.010_20260308_123649 Paper: self.20260308123622.010	Adaptive Precision Distilled SSM with State Caching README.md Adaptive Precision Distilled SSM with State Caching Overview This benchmark demonstrates an innovative approach to efficient Large Language Model (LLM) training and inference. It validates the hypothesis that distilling a dense Tr...	03-29 08:01	Success	-	View
exp_self.20260308124927.011_20260308_125115 Paper: self.20260308124927.011	Memory-Constrained Dynamic Precision Distillation for SSMs README.md Memory-Constrained Dynamic Precision Distillation for SSMs Overview This benchmark evaluates a Dynamic Precision strategy for State Space Models (SSMs). Traditional Large Language Models (LLMs) rely on KV-caches which grow qua...	03-29 08:01	Success	-	View
exp_self.20260308125208.012_20260308_125236 Paper: self.20260308125208.012	Mixed-Precision SSM Distillation with State Caching README.md Mixed-Precision SSM Distillation with State Caching Innovation Overview This benchmark evaluates a Mixed-Precision Student-Teacher Distillation pipeline designed to optimize State Space Models (SSMs) on memory-constrained hard...	03-29 08:01	Success	-	View
exp_self.20260308125532.013_20260308_125600 Paper: self.20260308125532.013	Dynamic Precision SSM Distillation Benchmark README.md Dynamic Precision SSM Distillation Benchmark This repository contains a standalone benchmark designed to test the hypothesis that State Space Model (SSM) distillation combined with Dynamic Precision State Caching can significa...	03-29 08:01	Success	-	View
exp_self.20260308125834.014_20260308_125900 Paper: self.20260308125834.014	Dynamic Precision SSM Distillation with Logit Caching README.md Dynamic Precision SSM Distillation with Logit Caching Overview This benchmark demonstrates a novel approach to Knowledge Distillation (KD) designed for hardware-constrained environments (e.g., 8GB GPUs). It combines a Transformer-...	03-29 08:01	Success	-	View
exp_self.20260308130119.015_20260308_130316 Paper: self.20260308130119.015	Benchmark: Dynamic Precision SSM with State Caching README.md Benchmark: Dynamic Precision SSM with State Caching This benchmark evaluates the performance characteristics of a simulated State Space Model (SSM) augmented with Dynamic Precision and State Caching mechanisms. Overview Th...	03-29 08:01	Success	-	View
exp_self.20260308132907.003_20260308_132934 Paper: self.20260308132907.003	FP8 Dynamic State Quantization Benchmark README.md FP8 Dynamic State Quantization Benchmark This benchmark evaluates the FP8 Dynamic State Quantization innovation. The core hypothesis is that the recurrent state memory bandwidth in State Space Models (SSMs) like Mamba is a bot...	03-29 08:01	Success	-	View
exp_self.20260308133122.004_20260308_133153 Paper: self.20260308133122.004	Hybrid Attention-SSM with Cross-Layer State Recycling README.md Hybrid Attention-SSM with Cross-Layer State Recycling Hypothesis The Attention mechanism captures rich local context. Projecting the final Attention KV-cache into the initial SSM state $h_0$ will result in faster convergence and l...	03-29 08:01	Success	-	View
exp_self.20260308133546.005_20260308_133737 Paper: self.20260308133546.005	Benchmark: SSM + Cache Co-design vs Standard Attention README.md Benchmark: SSM + Cache Co-design vs Standard Attention This benchmark evaluates the memory efficiency and inference speed of a State Space Model (SSM) augmented with a cache-co-design strategy against a standard Transformer-st...	03-29 08:01	Success	-	View
exp_self.20260308133800.006_20260308_134243 Paper: self.20260308133800.006	Benchmark: SSM + Cache Co-design vs. Standard Attention README.md Benchmark: SSM + Cache Co-design vs. Standard Attention Overview This benchmark evaluates the performance characteristics of a simulated State Space Model (SSM) with Cache Co-design against a standard Transformer Attention...	03-29 08:01	Success	-	View
exp_self.20260308134341.007_20260308_134420 Paper: self.20260308134341.007	Entropy-Gated State Caching for SSMs README.md Entropy-Gated State Caching for SSMs Innovation This benchmark explores an optimization technique for State Space Models (SSMs) such as Mamba. The core hypothesis is that not every token in a sequence requires a full-precision sta...	03-29 08:01	Success	-	View
exp_self.20260308134543.008_20260308_134621 Paper: self.20260308134543.008	Benchmark: Cross-Layer State Recycling (Tied States) README.md Benchmark: Cross-Layer State Recycling (Tied States) Overview This benchmark tests the hypothesis that sharing recurrent state memory between sequential layers (Cross-Layer Tying) can significantly reduce VRAM usage with minimal i...	03-29 08:01	Success	-	View
exp_self.20260308134914.009_20260308_135005 Paper: self.20260308134914.009	--- Combining SSM + Cache + Memory will improve throughput or memory efficiency without breaking 8GB execution.	03-29 08:01	Success	-	View
exp_self.20260308135202.010_20260308_135223 Paper: self.20260308135202.010	Associative State Memory (ASM) Retrieval Benchmark This benchmark evaluates the "Associative State Memory (ASM)" innovation. The core hypothesis is that augmenting a State Space Model (SSM) with a non-recurrent, associative memory bank (using KNN lookup) improves recall capabilities with ac...	03-29 08:01	Success	-	View
exp_self.20260308135530.011_20260308_135559 Paper: self.20260308135530.011	CPU-Pinned State Streaming (CPSS) Benchmark README.md CPU-Pinned State Streaming (CPSS) Benchmark Overview This benchmark validates the CPU-Pinned State Streaming (CPSS) innovation. The core hypothesis is that offloading the SSM (State Space Model) recurrent state tensor to pinne...	03-29 08:01	Success	-	View
exp_self.20260308135717.012_20260308_135752 Paper: self.20260308135717.012	Cross-Layer State Sharing via Memory Cache README.md Cross-Layer State Sharing via Memory Cache This benchmark validates the hypothesis that deep State Space Models (SSMs) re-learn similar features at different depths. By explicitly caching and injecting the state from Layer $N$ int...	03-29 08:01	Success	-	View
exp_self.20260308135903.013_20260308_135937 Paper: self.20260308135903.013	Sparse Associative State Cache (SAS-Cache) This repository contains the reference implementation and benchmark for the Sparse Associative State Cache (SAS-Cache). Hypothesis Standard State Space Models (SSMs) like Mamba compress the entire history into a fixed hidden state. Whil...	03-29 08:01	Success	-	View
exp_self.20260308140228.014_20260308_140257 Paper: self.20260308140228.014	```markdown bash pip install torch tqdm bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308140501.016_20260308_140529 Paper: self.20260308140501.016	```markdown Student Hypothesis Benchmark: SSM + Cache + Memory Co-design Hypothesis Combining SSM (State Space Models), Cache (State retention), and Memory (Gradient Checkpointing/Precision) optimizations will improve throughput and memory...	03-29 08:01	Success	-	View
exp_self.20260308140650.017_20260308_140722 Paper: self.20260308140650.017	Entropy-Driven Dynamic Quantization for SSM States README.md Entropy-Driven Dynamic Quantization for SSM States Overview This benchmark explores the hypothesis that State Space Models (SSMs) do not require full precision (FP16) for their recurrent states when processing predictable, low-ent...	03-29 08:01	Success	-	View
exp_self.20260308141439.020_20260308_141506 Paper: self.20260308141439.020	CPU-Pinned Segmented State Streaming README.md CPU-Pinned Segmented State Streaming Hypothesis LLM inference is fundamentally memory-bound. By treating the SSM (State Space Model) state or KV-Cache as a paged cache and streaming fixed-size segments from CPU RAM, we can effecti...	03-29 08:01	Success	-	View
exp_self.20260308141647.021_20260308_141715 Paper: self.20260308141647.021	Benchmark: Segmented State Recycle with Sliding Window Eviction README.md Benchmark: Segmented State Recycle with Sliding Window Eviction Overview This benchmark evaluates an innovative memory management technique for State Space Models (SSMs) and Attention-based mechanisms. By implementing a segmented...	03-29 08:01	Success	-	View
exp_self.20260308141824.022_20260308_141841 Paper: self.20260308141824.022	Saliency-Triggered CPU Stream Benchmark README.md Saliency-Triggered CPU Stream Benchmark This benchmark evaluates the Saliency-Triggered CPU Stream innovation for State Space Models (SSMs). Hypothesis Deeper layers in SSMs frequently enter low-entropy states where they act m...	03-29 08:01	Success	-	View
exp_self.20260308142120.023_20260308_142148 Paper: self.20260308142120.023	Saliency-Gated Async State Spilling Benchmark README.md Saliency-Gated Async State Spilling Benchmark This repository contains a benchmark implementation for Saliency-Gated Async State Spilling, a technique designed to optimize memory usage in State Space Models (SSMs) like Mamba d...	03-29 08:01	Success	-	View
exp_self.20260308142311.024_20260308_142338 Paper: self.20260308142311.024	Benchmark: SSM + Cache + Memory Co-design README.md Benchmark: SSM + Cache + Memory Co-design Overview This benchmark evaluates a Student Hypothesis regarding the co-design of State Space Models (SSM), efficient Caching strategies, and Dynamic Memory management. Hypothesis:...	03-29 08:01	Success	-	View
exp_self.20260308142448.025_20260308_142516 Paper: self.20260308142448.025	--- Section 1: README.md Benchmark: Entropy-Gated State Skipping Overview This benchmark evaluates the Entropy-Gated State Skipping innovation for Selective State Space Models (SSMs). The core hypothesis is that not every token requires a f...	03-29 08:01	Success	-	View
exp_self.20260308142858.026_20260308_142948 Paper: self.20260308142858.026	Cache-Retrieval Augmented SSM (CRASS) This repository contains the benchmark suite for CRASS (Cache-Retrieval Augmented SSM). Overview CRASS proposes a hybrid architecture where the hidden state of a State Space Model (SSM) is used to explicitly query a Key-Value (KV) cache...	03-29 08:01	Success	-	View
exp_self.20260308143545.027_20260308_143613 Paper: self.20260308143545.027	README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308143924.029_20260308_143945 Paper: self.20260308143924.029	Delta-State Residual Compression README.md Delta-State Residual Compression Hypothesis The state tensor $H_t$ in State Space Models (SSMs) exhibits high temporal correlation ($H_t \approx H_{t-1}$). Storing the full state for every sequence step during generation is redund...	03-29 08:01	Success	-	View
exp_self.20260308144238.030_20260308_144501 Paper: self.20260308144238.030	Magnitude-Adaptive State Quantization (MASQ) Overview This benchmark implements Magnitude-Adaptive State Quantization (MASQ) for State Space Models (SSM). The Innovation Standard SSMs and RNNs maintain a hidden state `h` that is typically stored in full precision (FP32 or FP16). H...	03-29 08:01	Success	-	View
exp_self.20260308144526.031_20260308_144602 Paper: self.20260308144526.031	```markdown bash pip install torch bash python benchmark.py ``` Expected Output The script will output VRAM usage and tokens per second. We expect a significant reduction in VRAM for the Innovation mode (>40%) with a negligible drop in processing speed...	03-29 08:01	Success	-	View
exp_self.20260308144831.032_20260308_144904 Paper: self.20260308144831.032	Delta-State Streaming Benchmark README.md Delta-State Streaming Benchmark Overview This benchmark evaluates Delta-State Streaming, an optimization technique designed to reduce the overhead of CPU-GPU data transfer in State Space Models (SSMs) or large recurrent networ...	03-29 08:01	Success	-	View
exp_self.20260308145053.033_20260308_145129 Paper: self.20260308145053.033	Linear-Sparse Recurrent Cache (LSRC) Benchmark README.md Linear-Sparse Recurrent Cache (LSRC) Benchmark This repository contains the benchmark code for the Linear-Sparse Recurrent Cache (LSRC) innovation. The Innovation State Space Models (SSMs), like Mamba, are excellent at efficie...	03-29 08:01	Success	-	View
exp_self.20260308145231.034_20260308_145254 Paper: self.20260308145231.034	Hierarchical State Cache (CPU-GPU Offload) README.md Hierarchical State Cache (CPU-GPU Offload) Innovation Overview This benchmark demonstrates a Hierarchical State Cache strategy for State Space Models (SSMs). By treating CPU pinned memory as a "Level 2" cache, we decouple the...	03-29 08:01	Success	-	View
exp_self.20260308145611.035_20260308_145633 Paper: self.20260308145611.035	```markdown README.md bash pip install torch tqdm python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308145846.036_20260308_145909 Paper: self.20260308145846.036	Section 1: README.md No summary available yet.	03-29 08:01	Success	-	View
exp_self.20260308150350.038_20260308_150416 Paper: self.20260308150350.038	Section 1: README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308150551.039_20260308_150618 Paper: self.20260308150551.039	CPU-Pinned State Checkpointing (CPSC) README.md CPU-Pinned State Checkpointing (CPSC) Overview This benchmark validates the CPU-Pinned State Checkpointing (CPSC) innovation. The hypothesis is that by offloading SSM (State Space Model) states to CPU pinned memory (system RAM...	03-29 08:01	Success	-	View
exp_self.20260308150904.040_20260308_150932 Paper: self.20260308150904.040	Entropy-Gated State Skipping Benchmark README.md Entropy-Gated State Skipping Benchmark This repository contains a minimal, self-contained benchmark for the Entropy-Gated State Skipping innovation. Hypothesis Tokens with low information density (low entropy) induce minimal c...	03-29 08:01	Success	-	View
exp_self.20260308151038.041_20260308_151119 Paper: self.20260308151038.041	Adaptive State Dimensionality (ASD) Benchmark This benchmark evaluates the Adaptive State Dimensionality (ASD) hypothesis. The core idea is that not all tokens in a sequence require the full state capacity of an SSM (State Space Model). By using a lightweight gating network, we cla...	03-29 08:01	Success	-	View
exp_self.20260308151512.042_20260308_151537 Paper: self.20260308151512.042	Student hypothesis: ssm + cache + memory This repository contains a compact, runnable benchmark designed to test the hypothesis that combining State Space Models (SSM), explicit state caching, and dynamic memory precision can improve throughput and memory efficiency compared to st...	03-29 08:01	Success	-	View
exp_self.20260308151705.043_20260308_151729 Paper: self.20260308151705.043	```markdown bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308151832.044_20260308_151858 Paper: self.20260308151832.044	Innovation: CPU-GPU State Streamer (CGSS) README.md Innovation: CPU-GPU State Streamer (CGSS) Objective: Benchmark the viability of offloading SSM (State Space Model) history states to CPU pinned memory to process sequences longer than GPU VRAM normally allows. Problem Stat...	03-29 08:01	Success	-	View
exp_self.20260308152053.045_20260308_152129 Paper: self.20260308152053.045	Benchmark: Delta-State Cache Compression (DSCC) README.md Benchmark: Delta-State Cache Compression (DSCC) Overview This benchmark implements and tests the Delta-State Cache Compression (DSCC) hypothesis for State Space Models (SSMs), specifically targeting Mamba-like architectures. T...	03-29 08:01	Success	-	View
exp_self.20260308152219.046_20260308_152331 Paper: self.20260308152219.046	```markdown README.md bash python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308152356.047_20260308_152425 Paper: self.20260308152356.047	Sketch-Based SSM History Compression README.md Sketch-Based SSM History Compression Innovation Summary This benchmark validates a novel approach to decoupling context length from VRAM usage in State Space Models (SSMs). By treating the SSM's hidden state trajectory as a stream...	03-29 08:01	Success	-	View
exp_self.20260308153834.048_20260308_153903 Paper: self.20260308153834.048	Student Hypothesis Benchmark: SSM + Cache Co-design README.md Student Hypothesis Benchmark: SSM + Cache Co-design Hypothesis We hypothesize that a co-design combining State Space Models (SSM), Caching mechanisms, and Memory optimization (Dynamic Precision) will significantly impr...	03-29 08:01	Success	-	View
exp_self.20260308154606.049_20260308_154627 Paper: self.20260308154606.049	Sketch-Preconditioned SSM State Overview This benchmark implements a Sketch-Preconditioned State Space Model (SSM). The core hypothesis is that the hidden state $h$ in standard recurrent architectures (like Mamba) is often redundant or low-rank. Instead of maintaining...	03-29 08:01	Success	-	View
exp_self.20260308154740.050_20260308_154808 Paper: self.20260308154740.050	Benchmark: Asynchronous CPU State Streaming for SSMs README.md Benchmark: Asynchronous CPU State Streaming for SSMs Overview This benchmark evaluates a CPU Offload Strategy for State Space Models (SSMs). Specifically, it tests the hypothesis that offloading the recurrent state accumulatio...	03-29 08:01	Success	-	View
exp_self.20260308155358.052_20260308_155601 Paper: self.20260308155358.052	Here is the design for a runnable benchmark based on the hypothesis of SSM + Cache Co-design with Dynamic Precision. Since the original architectural output was empty, this benchmark implements a representative synthetic experiment. It compares a baseline Float32 SSM implementation against an "Optimized" version that utilizes Dynamic Precision (simulating...	03-29 08:01	Success	-	View
exp_self.20260308155643.053_20260308_155717 Paper: self.20260308155643.053	Innovation: Token-Entropy Dynamic Precision for SSMs README.md Innovation: Token-Entropy Dynamic Precision for SSMs Hypothesis Not all tokens require high-precision state updates in State Space Models (SSMs). High-entropy tokens (rare words carrying high information) require FP16 stability to...	03-29 08:01	Success	-	View
exp_self.20260308160924.054_20260308_160958 Paper: self.20260308160924.054	KV-State Hybrid Cache Benchmark README.md KV-State Hybrid Cache Benchmark This repository contains a minimal, runnable benchmark for the KV-State Hybrid Cache architecture innovation. Hypothesis Standard Transformers rely on growing KV-Caches, which consume massive VR...	03-29 08:01	Success	-	View
exp_self.20260308161105.055_20260308_161141 Paper: self.20260308161105.055	Here are the requested files. README.md	03-29 08:01	Success	-	View
exp_self.20260308163619.001_20260308_163648 Paper: self.20260308163619.001	Frequency-Domain State Compression Benchmark README.md This repository contains a benchmark for Frequency-Domain State Compression, a novel technique to optimize memory usage in State Space Models (SSMs). The Innovation SSMs maintain a large internal state tensor that scales with...	03-29 08:01	Pending	-	View
exp_self.20260308165917.001_20260308_170017 Paper: self.20260308165917.001	Entropy-Adaptive State Quantization (EASQ) README.md Entropy-Adaptive State Quantization (EASQ) This benchmark tests the hypothesis that State Space Model (SSM) hidden states can be dynamically quantized to `float8` without significant performance degradation when the model's predic...	03-29 08:01	Success	-	View
exp_self.20260308170232.002_20260308_170309 Paper: self.20260308170232.002	Tiered State Streaming (TSS) Benchmark README.md Tiered State Streaming (TSS) Benchmark Overview This benchmark implements Tiered State Streaming (TSS), a technique designed to overcome VRAM limitations in State Space Models (SSMs) like Mamba. The Innovation Standard SSMs ma...	03-29 08:01	Success	-	View
exp_self.20260308170554.003_20260308_170642 Paper: self.20260308170554.003	Innovation: Entropy-Gated State Quantization README.md Innovation: Entropy-Gated State Quantization Title: Entropy-Gated State Quantization for SSMs Techniques: ssm, dynamic_precision, memory Hypothesis The recurrent state in Selective State Space Models (SSMs) like Mamba cont...	03-29 08:01	Success	-	View
exp_self.20260308171551.001_20260308_171622 Paper: self.20260308171551.001	Student hypothesis: ssm + cache co-design Paper ID: self.20260308171551.001 - Hypothesis: Combining ssm + cache + memory will improve throughput or memory efficiency without breaking 8GB execution. - Plan: Create a compact comparative benchmark against a simple baseline, measure VR...	03-29 08:01	Success	-	View
exp_self.20260308171721.002_20260308_171800 Paper: self.20260308171721.002	Linear-Associative State Injection (LASI) Benchmark README.md Linear-Associative State Injection (LASI) Benchmark This repository contains a minimal, runnable benchmark demonstrating the Linear-Associative State Injection (LASI) concept. Overview Standard State Space Models (SSMs), like...	03-29 08:01	Success	-	View
exp_self.20260308171902.003_20260308_171929 Paper: self.20260308171902.003	Dynamic Entropy State Reset README.md Dynamic Entropy State Reset Innovation: Dynamic Entropy State Reset (SSM) Hypothesis: High entropy in output logits indicates a transition or noise. Using this as a trigger to reset the SSM state will improve stability and...	03-29 08:01	Success	-	View
exp_self.20260308172220.004_20260308_172302 Paper: self.20260308172220.004	Section 1: README.md bash pip install torch numpy python benchmark.py	03-29 08:01	Success	-	View
exp_self.20260308172406.005_20260308_172438 Paper: self.20260308172406.005	Per-Matrix Dynamic Precision Paper ID: self.20260308172406.005 - Hypothesis: The projection matrices (B, C) are more robust to quantization than the state transition matrix (A). Applying aggressive 4-bit quantization only to B/C yields speedups with minimal accuracy lo...	03-29 08:01	Success	-	View
exp_self.20260308172546.006_20260308_172621 Paper: self.20260308172546.006	GLA-2: Hybrid Linear-SSM Gate Benchmark README.md GLA-2: Hybrid Linear-SSM Gate Benchmark This repository implements a benchmark for the GLA-2 (Gated Linear-Attention 2) architecture. This innovation tests the hypothesis that a lightweight, learned gating mechanism can optima...	03-29 08:01	Success	-	View
exp_self.20260308172925.007_20260308_172954 Paper: self.20260308172925.007	Student hypothesis: ssm + cache co-design Paper ID: self.20260308172925.007 - Hypothesis: Combining ssm + cache + memory will improve throughput or memory efficiency without breaking 8GB execution. - Plan: Create a compact comparative benchmark against a simple baseline, measure VR...	03-29 08:01	Success	-	View
exp_self.20260308173100.008_20260308_173150 Paper: self.20260308173100.008	Hierarchical State Space Partitioning (HSSP) Paper ID: self.20260308173100.008 - Hypothesis: The SSM state vector can be segmented into a short-term active window (GPU) and a long-term compressed history (CPU). Transferring only the delta every N steps will maintain perplexity while r...	03-29 08:01	Success	-	View
exp_self.20260308173513.009_20260308_173548 Paper: self.20260308173513.009	Zero-Copy Memory-Mapped State Streaming for SSMs README.md Zero-Copy Memory-Mapped State Streaming for SSMs This repository provides a runnable benchmark for Zero-Copy Memory-Mapped State Streaming. The Innovation Standard State Space Models (SSMs) require maintaining recurrent states...	03-29 08:01	Success	-	View
exp_self.20260308174849.010_20260308_174914 Paper: self.20260308174849.010	Dormant State Offloading (DSO) Benchmark README.md Dormant State Offloading (DSO) Benchmark Innovation: Dormant State Offloading (DSO) Category: Memory Optimization, SSM/Cache Management Hypothesis State Space Models (SSMs) and Transformers processing long contexts (128k+)...	03-29 08:01	Success	-	View
exp_self.20260308175017.011_20260308_175052 Paper: self.20260308175017.011	Cross-Layer State Distillation (CLSD) Benchmark README.md Cross-Layer State Distillation (CLSD) Benchmark Overview This benchmark evaluates the Cross-Layer State Distillation (CLSD) hypothesis. The core idea is to replace a "Deep" stack of sequential State Space Model (SSM) layers wi...	03-29 08:01	Success	-	View
exp_self.20260308175435.012_20260308_175501 Paper: self.20260308175435.012	Asynchronous Host-Device State Ring Buffer README.md Asynchronous Host-Device State Ring Buffer Hypothesis By maintaining a sliding window of 'active' states on GPU and 'dormant' states in pageable/pinned CPU memory, we can theoretically infer infinite context lengths on 8GB GPUs, b...	03-29 08:01	Success	-	View
exp_self.20260308175629.013_20260308_175701 Paper: self.20260308175629.013	Sparse Associative State Injection (SASI) Paper ID: self.20260308175629.013 - Hypothesis: Injecting a k-NN retrieved vector from a running history cache into the SSM input will improve performance on long-context needle-in-haystack tasks without re-training. - Plan: Implement a CPU...	03-29 08:01	Success	-	View
exp_self.20260308175823.014_20260308_175854 Paper: self.20260308175823.014	Prompt-Gated Temporal Decay (PGTD) Benchmark README.md Prompt-Gated Temporal Decay (PGTD) Benchmark This benchmark evaluates the Prompt-Gated Temporal Decay (PGTD) innovation against a standard SSM baseline. Hypothesis Static SSMs often forget early context due to fixed decay rate...	03-29 08:01	Success	-	View
exp_self.20260308180140.015_20260308_180204 Paper: self.20260308180140.015	Entropy-Adaptive State Quantization Benchmark README.md Entropy-Adaptive State Quantization Benchmark This benchmark evaluates a novel Dynamic Precision State Space Model (SSM) wrapper. The core hypothesis is that memory bandwidth and compute can be optimized by adjusting the numer...	03-29 08:01	Success	-	View
exp_self.20260308180341.016_20260308_180415 Paper: self.20260308180341.016	Sparse Associative State (SAS) Benchmark README.md Sparse Associative State (SAS) Benchmark This benchmark validates the Sparse Associative State (SAS) hypothesis, which proposes that dense State Space Model (SSM) states can be optimized for long-context tasks by offloading "d...	03-29 08:01	Success	-	View
exp_self.20260308180540.017_20260308_180605 Paper: self.20260308180540.017	Benchmark: SSM + Cache + Dynamic Precision Co-design README.md Benchmark: SSM + Cache + Dynamic Precision Co-design This benchmark investigates the hypothesis that integrating State Space Models (SSM), optimized Caching strategies, and Dynamic Precision (AMP) can yield better memo...	03-29 08:01	Success	-	View
exp_pytrain.20260329075911.001_20260329_075929 Paper: pytrain.20260329075911.001	Dynamic Entry Point Dispatcher This benchmark tests the efficiency and robustness of a dynamic plugin system using Python's `typing.Protocol`. The design simulates an entry-point based architecture where concrete classes are registered, validated against a structural int...	03-29 08:00	Success	-	View
exp_pytrain.20260327104617.001_20260327_104619 Paper: pytrain.20260327104617.001	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-27 10:46	Success	-	View
exp_pytrain.20260326135218.064_20260326_135239 Paper: pytrain.20260326135218.064	Python Skill Fallback Title: Dynamic Plugin Loader with Protocol Enforcement - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-26 13:52	Success	-	View
exp_pytrain.20260326132907.063_20260326_132939 Paper: pytrain.20260326132907.063	Python Skill Fallback Title: Dynamic ZipApp Construction and Runtime Type Verification - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-26 13:29	Success	-	View
exp_pytrain.20260326130903.062_20260326_130924 Paper: pytrain.20260326130903.062	Generic Plugin Registry with PEP 695 Syntax Overview This benchmark demonstrates the use of PEP 695 Type Parameter Syntax (available in Python 3.12+) to create a robust, type-safe Generic Plugin Registry. Key Features 1. Type Parameters (PEP 695): Uses the new `class ClassNam...	03-26 13:09	Success	-	View
exp_pytrain.20260326124943.061_20260326_125019 Paper: pytrain.20260326124943.061	Python Skill Fallback Title: Dynamic Type-Safe Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-26 12:50	Success	-	View
exp_pytrain.20260326122844.060_20260326_122920 Paper: pytrain.20260326122844.060	Python Skill Fallback Title: Strictly-Typed Generic Registry for Distributed Configs - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-26 12:29	Success	-	View
exp_pytrain.20260326120655.059_20260326_120725 Paper: pytrain.20260326120655.059	Python Skill Fallback Title: Generic Component Registry with Runtime Validation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-26 12:07	Success	-	View
exp_pytrain.20260326114330.058_20260326_114355 Paper: pytrain.20260326114330.058	Strict-Typed Virtual Module Loader This coding drill validates the ability to programmatically construct Python modules in memory using the `types` and `importlib` standard libraries, while enforcing strict behavioral contracts using `typing.Protocol`. Overview The script im...	03-26 11:44	Success	-	View
exp_pytrain.20260326112128.057_20260326_112159 Paper: pytrain.20260326112128.057	Type-Safe Dynamic Plugin Loader Benchmark This benchmark tests a Python environment's ability to dynamically generate a package structure, load modules at runtime using `importlib`, and strictly validate their interfaces using modern static typing features (`typing.Protocol` and `@...	03-26 11:22	Success	-	View
exp_pytrain.20260326105908.056_20260326_105930 Paper: pytrain.20260326105908.056	Protocol-Based Extensible Data Ingestion Framework This coding drill focuses on advanced type hinting features in Python, specifically `typing.Protocol` for structural subtyping and `typing.Generic` for creating reusable, type-safe components. Objective Implement a generic data ingestion an...	03-26 10:59	Success	-	View
exp_pytrain.20260326103622.055_20260326_103655 Paper: pytrain.20260326103622.055	Python Skill Fallback Title: Type-Safe Generic Event Dispatcher - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-26 10:36	Success	-	View
exp_pytrain.20260326101135.054_20260326_101209 Paper: pytrain.20260326101135.054	Python Skill Fallback Title: Strict Generic Registry for Extensible Packages - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-26 10:12	Success	-	View
exp_pytrain.20260326094816.053_20260326_094840 Paper: pytrain.20260326094816.053	```python README.md Robust Plugin Loader with Runtime Type Validation Objective This benchmark tests your ability to construct a secure, dynamic plugin loading system using Python's standard library. The system must enforce strict interface contracts...	03-26 09:48	Success	-	View
exp_pytrain.20260326092327.052_20260326_092413 Paper: pytrain.20260326092327.052	Python Skill Fallback Title: Strictly Typed Dynamic Component Registry - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-26 09:24	Success	-	View
exp_pytrain.20260326085929.051_20260326_090005 Paper: pytrain.20260326085929.051	Type-Safe Data Serializer and CLI Tool This benchmark implements a robust, type-safe serialization library and command-line interface within a single Python file. Overview The `benchmark.py` script serves a dual purpose: 1. Library: It acts as an importable module providing...	03-26 09:00	Success	-	View
exp_pytrain.20260326083042.050_20260326_083124 Paper: pytrain.20260326083042.050	Dynamic Plugin Loader with Type Safety Hypothesis A robust system relies on strict interfaces and dynamic discovery mechanisms rather than hard-coded dependencies. By combining `typing.Protocol` with `importlib`, developers can create extensible architectures that fail predictab...	03-26 08:31	Success	-	View
exp_pytrain.20260326075836.049_20260326_075921 Paper: pytrain.20260326075836.049	Dynamic Plugin Packaging and Type Verification This benchmark tests a Python system's ability to dynamically generate, package, and verify source code at runtime. Scenario The system must act as an autonomous plugin manager. It defines a strict Protocol (`DataProcessor`) that expect...	03-26 07:59	Success	-	View
exp_pytrain.20260326073144.048_20260326_073209 Paper: pytrain.20260326073144.048	Python Skill Fallback Title: Runtime-Validated Package Scaffolder with Modern Generics - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-26 07:32	Success	-	View
exp_pytrain.20260326070337.047_20260326_070425 Paper: pytrain.20260326070337.047	Type-Safe Dynamic Module Loader Benchmark Overview This coding drill tests the ability to construct a robust, zero-dependency plugin architecture using Python's standard library. The focus is on strict interface enforcement using `typing.Protocol` and the dynamic loading of modules...	03-26 07:04	Success	-	View
exp_pytrain.20260326063939.046_20260326_064013 Paper: pytrain.20260326063939.046	Dynamic Plugin Registry with Virtual Package Simulation Overview This benchmark tests your ability to construct a robust, type-safe plugin architecture similar to those found in high-performance ML frameworks like `vLLM` or `Diffusers`. It requires creating a virtual package namespace at runtime...	03-26 06:40	Success	-	View
exp_pytrain.20260326061608.045_20260326_061642 Paper: pytrain.20260326061608.045	PEP 621 Metadata Validator and Version Syncer Overview This benchmark implements a robust, static analysis tool to ensure build integrity by synchronizing version information between a package's source code (`__init__.py`) and its build metadata (`pyproject.toml`). The Hypothesis Confi...	03-26 06:16	Success	-	View
exp_pytrain.20260326053307.044_20260326_053344 Paper: pytrain.20260326053307.044	Dynamic Type-Safe Plugin Registry Benchmark This benchmark validates the implementation of a dynamic plugin system that combines runtime module discovery with static type checking using Python's `typing.Protocol`. Objective The goal is to implement a `PluginRegistry` that can: 1. Dyn...	03-26 05:33	Success	-	View
exp_pytrain.20260326050255.043_20260326_050325 Paper: pytrain.20260326050255.043	Typed Plugin System with Package Simulation Overview This benchmark challenges the implementation of a robust, type-safe plugin architecture within a single Python file. It simulates a micro-package environment using standard library features, focusing on `typing.Protocol` for struct...	03-26 05:03	Success	-	View
exp_pytrain.20260326043055.042_20260326_043149 Paper: pytrain.20260326043055.042	Generic Entity Repository with PEP 695 Syntax This benchmark tests the implementation of a generic in-memory repository using Python 3.12+ features. It validates the use of PEP 695 Type Parameter Syntax (introducing type parameters using square brackets) and the new `type` statemen...	03-26 04:31	Success	-	View
exp_pytrain.20260326035513.041_20260326_035556 Paper: pytrain.20260326035513.041	Python Skill Fallback Title: Dynamic Module Loader with Structural Subtyping - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-26 03:55	Success	-	View
exp_pytrain.20260326031951.040_20260326_032043 Paper: pytrain.20260326031951.040	Benchmark: Typed Plugin Registry with Strict Packaging Hygiene This coding drill evaluates your ability to design a robust, modular library architecture within a single file. You must leverage Python's advanced typing features (Protocols, Generics) to enforce interface contracts and implement strict pa...	03-26 03:20	Success	-	View
exp_pytrain.20260326023831.039_20260326_023924 Paper: pytrain.20260326023831.039	Strictly Typed Component Registry with CLI Simulation This benchmark tests the ability to construct a zero-dependency, type-safe plugin registry and command-line interface (CLI) dispatcher, mimicking the architectural patterns found in major ML libraries like Hugging Face Transformers. Problem...	03-26 02:39	Success	-	View
exp_pytrain.20260326020413.038_20260326_020438 Paper: pytrain.20260326020413.038	Dynamic Type-Checked Plugin Loader Objective Design a Python system that bridges static type safety with dynamic runtime execution. The goal is to define a strictly typed generic interface using `typing.Protocol` and `TypeVar`, programmatically generate a Python package in a...	03-26 02:04	Success	-	View
exp_pytrain.20260326012417.037_20260326_012508 Paper: pytrain.20260326012417.037	Python Skill Fallback Title: Runtime Type-Checked Package Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-26 01:25	Success	-	View
exp_pytrain.20260326005036.036_20260326_005159 Paper: pytrain.20260326005036.036	Metadata-Aware Plugin Loader This benchmark challenges you to implement a robust, type-safe plugin architecture using Python's standard library. The system must dynamically discover a "third-party" plugin package using `importlib.metadata` and verify its compliance wit...	03-26 00:52	Success	-	View
exp_pytrain.20260326001549.035_20260326_001633 Paper: pytrain.20260326001549.035	PEP 695 Generic Pipeline Processor Benchmark This benchmark tests your ability to utilize modern Python 3.12+ type hinting features (PEP 695) to build a robust, type-safe data processing pipeline. It validates the new Type Parameter Syntax for classes and type aliases, eliminating the...	03-26 00:16	Success	-	View
exp_pytrain.20260325234909.034_20260325_234936 Paper: pytrain.20260325234909.034	Strictly-Typed Modular Configuration Registry This benchmark evaluates the implementation of a robust, library-grade configuration system using Python's advanced type hinting features. The solution must simulate a core component of a large-scale application (similar to LitGPT), enforci...	03-25 23:49	Success	-	View
exp_pytrain.20260325232603.033_20260325_232640 Paper: pytrain.20260325232603.033	Strict Configuration Validator Benchmark This benchmark evaluates a high-performance, zero-dependency configuration validation engine designed for production-grade Python applications. It utilizes advanced metaprogramming with `typing` and `dataclasses` to enforce strict schema co...	03-25 23:26	Success	-	View
exp_pytrain.20260325230200.032_20260325_230228 Paper: pytrain.20260325230200.032	Generic Plugin Registry with Protocol-Based Constraints Description This benchmark implements a robust, modular plugin registry system using Python's `typing.Protocol` and `typing.TypeVar`. It mimics architectural patterns found in large-scale frameworks (like Hugging Face Transformers or Diffus...	03-25 23:02	Success	-	View
exp_pytrain.20260325223351.031_20260325_223418 Paper: pytrain.20260325223351.031	Python Skill Fallback Title: Protocol-Based Dynamic Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-25 22:34	Success	-	View
exp_pytrain.20260325220212.030_20260325_220242 Paper: pytrain.20260325220212.030	Python Skill Fallback Title: Metadata-Aware Secure Source Archiver - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-25 22:02	Success	-	View
exp_pytrain.20260325213137.029_20260325_213231 Paper: pytrain.20260325213137.029	Static Package Metadata and Type-Strictness Verifier This benchmark implements a CLI verification tool designed to statically analyze Python package structures. It enforces code quality standards by parsing Abstract Syntax Trees (AST) without executing the target code, ensuring safety and sid...	03-25 21:32	Success	-	View
exp_pytrain.20260325203954.028_20260325_204021 Paper: pytrain.20260325203954.028	Benchmark: PEP 695 Generic Plugin Registry This benchmark evaluates the implementation of a type-safe plugin registry system using Python 3.12+ Type Parameter Syntax (PEP 695). Objectives 1. Modern Syntax: Utilize the new class-based type parameter syntax (e.g., `class Regis...	03-25 20:40	Success	-	View
exp_pytrain.20260325201424.027_20260325_201454 Paper: pytrain.20260325201424.027	Strictly Typed Autograd System with Protocol Contracts Design Brief This coding drill validates the hypothesis that an autonomous system can produce robust, maintainable code by implementing a simplified Automatic Differentiation (autograd) engine. The implementation must leverage Python's type...	03-25 20:14	Success	-	View
exp_pytrain.20260325195258.026_20260325_195320 Paper: pytrain.20260325195258.026	Type-Safe Configuration & Dynamic Plugin Dispatcher Benchmark Overview This benchmark evaluates the ability to construct a robust, modular Python architecture using the standard library (`typing`, `dataclasses`, `importlib`). It simulates a simplified Machine Learning inference framework where the exe...	03-25 19:53	Success	-	View
exp_pytrain.20260325193156.025_20260325_193221 Paper: pytrain.20260325193156.025	Python Skill Fallback Title: Typed Event Dispatcher with Module Hygiene - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-25 19:32	Success	-	View
exp_pytrain.20260325191221.024_20260325_191246 Paper: pytrain.20260325191221.024	Strictly Typed Modular Log Processor Overview This coding drill benchmark evaluates your ability to construct a robust, multi-module Python package that strictly enforces type safety and adheres to PEP 8 standards. Objective Create a Python package named `logtools` containing:...	03-25 19:12	Success	-	View
exp_pytrain.20260325185336.023_20260325_185406 Paper: pytrain.20260325185336.023	In-Memory Zip Loader with Protocol Enforcement This coding drill demonstrates a robust method for creating, packaging, and enforcing strict structural typing (Protocol) for Python plugins dynamically loaded from a Zip archive, without persisting files to disk (using temporary files). Ob...	03-25 18:54	Success	-	View
exp_pytrain.20260325183432.022_20260325_183453 Paper: pytrain.20260325183432.022	PEP 695 Generic Registry & Introspection Benchmark This benchmark evaluates the implementation of a thread-safe generic registry utilizing PEP 695 Type Parameter Syntax (introduced in Python 3.12). It validates the reduction of boilerplate code and verifies module introspection capabili...	03-25 18:34	Success	-	View
exp_pytrain.20260325181145.021_20260325_181208 Paper: pytrain.20260325181145.021	Structural Plugin Loader Benchmark This benchmark evaluates the ability to construct a robust, decoupled plugin architecture using Python's `importlib` for dynamic discovery and `typing.Protocol` for structural interface validation. Objective Create a standalone system that...	03-25 18:12	Success	-	View
exp_pytrain.20260325175112.020_20260325_175144 Paper: pytrain.20260325175112.020	Dynamic Backend Registry with Runtime Type Verification This benchmark demonstrates the creation of a robust, modular plugin system using Python's standard library. It simulates a high-performance computing environment (similar to ML frameworks like PyTorch or Lightning) where backend implementa...	03-25 17:51	Success	-	View
exp_pytrain.20260325173055.019_20260325_173125 Paper: pytrain.20260325173055.019	Dynamic Namespace Package Injection & Runtime Type Verification Overview This benchmark tests the ability to implement a robust, runtime-safe plugin loader using Python's standard library. The solution must dynamically create a namespace package from a string source, inject it into the runtime path, and...	03-25 17:31	Success	-	View
exp_pytrain.20260325170955.018_20260325_171018 Paper: pytrain.20260325170955.018	Dynamic Plugin Inspector with Type Guarantees This benchmark tests the ability to write a robust, type-safe Python utility for runtime package introspection. It utilizes `importlib.metadata` to inspect installed distributions and enforces strict data structures using `typing.TypedDict`...	03-25 17:10	Success	-	View
exp_pytrain.20260325164801.017_20260325_164829 Paper: pytrain.20260325164801.017	Type-Safe Plugin Loader with Runtime Validation This benchmark evaluates the ability to design a robust dynamic plugin system using Python's standard library. Objective Create a `PluginLoader` class that dynamically discovers, imports, and validates Python modules from a temporary direct...	03-25 16:48	Success	-	View
exp_pytrain.20260325162708.016_20260325_162736 Paper: pytrain.20260325162708.016	Python Skill Fallback Title: Generic Typed Pipeline and CLI Interface - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-25 16:27	Success	-	View
exp_pytrain.20260325160606.015_20260325_160638 Paper: pytrain.20260325160606.015	Modern Generic Plugin Registry - PEP 695 Benchmark This benchmark validates the hypothesis that utilizing PEP 695 Type Parameter Syntax significantly reduces the boilerplate associated with defining generic containers while enforcing stricter interface adherence via *PEP 484 Protocols...	03-25 16:06	Success	-	View
exp_pytrain.20260325154529.014_20260325_154559 Paper: pytrain.20260325154529.014	Dynamic 'Plugin' Registry with Type-Safe Packaging This benchmark evaluates a Python engineer's ability to implement a modular, type-safe plugin system using advanced standard library features. Objective Construct a runtime environment that dynamically discovers, loads, and validates "plugi...	03-25 15:46	Success	-	View
exp_pytrain.20260325152529.013_20260325_152555 Paper: pytrain.20260325152529.013	Dynamic Module Construction & Type Validation Benchmark This benchmark evaluates the ability to construct Python modules dynamically at runtime using `types.ModuleType` and `sys.modules`, and to rigorously validate the generated components against strict `typing.Protocol` definitions. Scenario T...	03-25 15:25	Success	-	View
exp_pytrain.20260325150542.012_20260325_150609 Paper: pytrain.20260325150542.012	Dynamic Component Registry and Generic Loader This benchmark demonstrates the construction of a robust plugin architecture using Python's standard library. It mimics the `AutoModel` pattern found in major ML frameworks (like Hugging Face Transformers) by leveraging `inspect` and `typin...	03-25 15:06	Success	-	View
exp_pytrain.20260325144750.011_20260325_144752 Paper: pytrain.20260325144750.011	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-25 14:47	Success	-	View
exp_pytrain.20260325144141.010_20260325_144142 Paper: pytrain.20260325144141.010	Python Skill Fallback Title: Python reliability drill: typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-25 14:41	Success	-	View
exp_pytrain.20260325142735.009_20260325_142804 Paper: pytrain.20260325142735.009	Python Skill Fallback Title: Runtime Type-Safe Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-25 14:28	Success	-	View
exp_pytrain.20260325140547.008_20260325_140630 Paper: pytrain.20260325140547.008	Python Skill Fallback Title: Robust PEP 440 Version Resolver with Generic Constraints - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-25 14:06	Success	-	View
exp_pytrain.20260325134544.007_20260325_134607 Paper: pytrain.20260325134544.007	Strictly-Typed Configuration Resolver Benchmark This benchmark validates a Python module (`benchmark.py`) that implements a strict configuration schema for tensor initialization using Python's `typing` module. Goals 1. Structure: Implement a module compliant with packaging standards...	03-25 13:46	Success	-	View
exp_pytrain.20260325132404.006_20260325_132436 Paper: pytrain.20260325132404.006	Python Skill Fallback Title: Typed Plugin Registry with Semantic Versioning - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-25 13:24	Success	-	View
exp_pytrain.20260325130330.005_20260325_130352 Paper: pytrain.20260325130330.005	Strictly Typed Dependency Injection Container Benchmark This benchmark implements a robust Dependency Injection (DI) container using Python's standard library. It demonstrates the use of `typing.Protocol` for interface definition and `inspect.signature` for automatic dependency resolution (auto-...	03-25 13:03	Success	-	View
exp_pytrain.20260325123738.004_20260325_123810 Paper: pytrain.20260325123738.004	Strictly Typed Plugin System CLI This benchmark demonstrates the implementation of a strictly typed, architectural CLI using Python's `typing.Protocol`, `TypedDict`, and `argparse`. It simulates a plugin system where components are decoupled via structural subtyping (proto...	03-25 12:38	Success	-	View
exp_pytrain.20260325121326.003_20260325_121424 Paper: pytrain.20260325121326.003	Robust Async Micro-Service Skeleton Benchmark This benchmark validates the implementation of a robust, asynchronous Python micro-service skeleton. It tests the developer's ability to structure a Python application simulating a package layout, utilizing strict type hints (`typing.Protoc...	03-25 12:14	Success	-	View
exp_pytrain.20260325114709.002_20260325_114736 Paper: pytrain.20260325114709.002	Python Skill Fallback Title: PEP 695 Type-Safe Command Dispatcher - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-25 11:47	Success	-	View
exp_pytrain.20260325112102.001_20260325_112127 Paper: pytrain.20260325112102.001	Python Skill Fallback Title: Strictly Typed Dynamic Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-25 11:21	Success	-	View
exp_pytrain.20260325104914.001_20260325_104946 Paper: pytrain.20260325104914.001	Python Skill Fallback Title: Structural Subtyping for Package Entry Points - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-25 10:49	Success	-	View
exp_pytrain.20260324102734.004_20260324_102819 Paper: pytrain.20260324102734.004	Strictly Typed Event Dispatcher Module Overview This coding drill benchmarks a strictly typed, modular Event Dispatcher system designed with Python's `typing.Protocol` and `typing.Generic` features. Architecture The solution implements a Type-Safe Observer Pattern. 1. **`Eve...	03-24 10:28	Success	-	View
exp_pytrain.20260324095427.003_20260324_095516 Paper: pytrain.20260324095427.003	Protocol-Based Dynamic Extension Loader Objective This benchmark tests a Python system's ability to simulate a robust, heterogeneous plugin architecture. It demonstrates the creation of a strict type-safe interface using `typing.Protocol`, dynamic discovery of modules using `impo...	03-24 09:55	Success	-	View
exp_pytrain.20260324092754.002_20260324_092822 Paper: pytrain.20260324092754.002	Python Skill Fallback Title: Modern Generic Result Monad & Module API Design - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-24 09:28	Success	-	View
exp_pytrain.20260324091447.001_20260324_091555 Paper: pytrain.20260324091447.001	Python Skill Fallback Title: Dynamic Entrypoint Loader with Structural Typing - Focus: typing.Protocol, typing.runtime_checkable, typing.Annotated, importlib, packaging - Note: Generated fallback due to unavailable model output.	03-24 09:15	Success	-	View
exp_pytrain.20260318102029.002_20260318_102057 Paper: pytrain.20260318102029.002	Generic Result Wrapper with PEP 695 This benchmark demonstrates the implementation of a robust, Rust-like `Result` type utilizing Python 3.12's PEP 695 Type Parameter Syntax. Features - PEP 695 Syntax: Uses the new `class MyClass[T]:` syntax, removing the need for explici...	03-18 10:21	Success	-	View
exp_pytrain.20260318095243.001_20260318_095341 Paper: pytrain.20260318095243.001	Strictly-Typed Modular Data Pipeline Benchmark This benchmark evaluates a Python implementation of a modular data processing pipeline. The architecture prioritizes Structural Subtyping (Protocols) over nominal inheritance, ensuring that components are interchangeable based on their...	03-18 09:53	Success	-	View
exp_pytrain.20260316152436.002_20260316_152457 Paper: pytrain.20260316152436.002	Type-Safe Generic Cache (PEP 695) This benchmark tests your ability to utilize PEP 695 Type Parameter Syntax (introduced in Python 3.12). The Challenge Modern Python allows you to define generic classes using the syntax `class MyClass[T]:`, removing the need for `TypeVa...	03-16 15:25	Success	-	View
exp_pytrain.20260316150232.001_20260316_150252 Paper: pytrain.20260316150232.001	MiniPlugin: Strictly Typed Modular Plugin System Overview This benchmark demonstrates the implementation of a robust, single-file Python package named `MiniPlugin`. It showcases advanced Python features including Generic Protocols, TypeVars, and strict runtime type checking enforcement wi...	03-16 15:02	Success	-	View
exp_pytrain.20260316142805.005_20260316_142858 Paper: pytrain.20260316142805.005	Dynamic Namespace Packaging and Runtime Protocol Verification This benchmark tests an autonomous agent's ability to programmatically construct a Python namespace package on a virtual file system, perform dynamic module loading using `importlib`, and enforce runtime interface contracts using `typing.Pr...	03-16 14:29	Success	-	View
exp_pytrain.20260316140324.004_20260316_140411 Paper: pytrain.20260316140324.004	Python Skill Fallback Title: Type-Safe Plugin System with Packaging Hygiene - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-16 14:04	Success	-	View
exp_pytrain.20260316134142.003_20260316_134220 Paper: pytrain.20260316134142.003	Python Skill Fallback Title: Strictly Typed Modular Data Pipeline - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-16 13:42	Success	-	View
exp_pytrain.20260316131743.002_20260316_131835 Paper: pytrain.20260316131743.002	Generic Versioned Registry using PEP 695 This benchmark tests the implementation of a type-safe, generic registry for versioned software artifacts using Python 3.12's Type Parameter Syntax (PEP 695). Objectives 1. Demonstrate the reduction of boilerplate code using the new generic...	03-16 13:18	Success	-	View
exp_pytrain.20260316124809.001_20260316_124836 Paper: pytrain.20260316124809.001	Dynamic Package Construction and Protocol Validation Overview This benchmark evaluates the system's ability to programmatically construct Python package structures at runtime and validate type safety using `typing.Protocol`. Tasks 1. Protocol Definition: Define a `DataPlugin` protocol req...	03-16 12:48	Success	-	View
exp_pytrain.20260316122337.002_20260316_122404 Paper: pytrain.20260316122337.002	Generic Repository Pattern with PEP 695 Type Parameters Overview This benchmark validates the implementation of a Generic Repository Pattern utilizing the PEP 695 Type Parameter Syntax introduced in Python 3.12. The objective is to demonstrate a clean, maintainable architecture by levera...	03-16 12:24	Success	-	View
exp_pytrain.20260316115901.001_20260316_115932 Paper: pytrain.20260316115901.001	Python Skill Fallback Title: Implementation of a Strictly-Typed In-Memory Package Registry - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-16 11:59	Success	-	View
exp_pytrain.20260316100558.004_20260316_100640 Paper: pytrain.20260316100558.004	Python Skill Fallback Title: Strictly-Typed Dynamic Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-16 10:06	Success	-	View
exp_pytrain.20260316093508.003_20260316_093529 Paper: pytrain.20260316093508.003	Strict Zip-App Bundler with Runtime Type Validation This benchmark tests the ability to engineer a robust code packaging pipeline. The script implements a `StrictBundler` class that enforces code quality standards by inspecting Python source files, ensuring type hint coverage using the `typi...	03-16 09:35	Success	-	View
exp_pytrain.20260316090922.002_20260316_090959 Paper: pytrain.20260316090922.002	Python Skill Fallback Title: PEP 695 Generic Plugin Registry with Importlib Introspection - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-16 09:10	Success	-	View
exp_pytrain.20260316084207.001_20260316_084229 Paper: pytrain.20260316084207.001	Python Skill Fallback Title: Strictly Typed Plugin System with Entry-Point Simulation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-16 08:42	Success	-	View
exp_pytrain.20260315163655.006_20260315_163719 Paper: pytrain.20260315163655.006	Dynamic Type-Checked Plugin Loader Benchmark Overview This benchmark validates the robustness of a modular autonomous system component by simulating the dynamic loading of a computation engine (plugin). It enforces strict Protocol compliance using Python's `typing` module and vali...	03-15 16:37	Pending	-	View
exp_self.20260315162309.008_20260315_162330 Paper: self.20260315162309.008	Self-directed benchmark: SSM Strategy Stress Test This repository contains a micro-benchmark designed to evaluate the efficacy of a Disciplined Memory Policy within State Space Models (SSMs). Hypothesis Applying an SSM with a disciplined memory policy (fixed-size state recurrence) sign...	03-15 16:34	Success	-	View
exp_self.20260315162013.007_20260315_162046 Paper: self.20260315162013.007	SSM Strategy Stress Test Benchmark This benchmark evaluates the performance impact of applying a disciplined memory policy to State Space Models (SSMs) when operating under strict VRAM constraints (8GB). Hypothesis Applying SSMs with a disciplined memory policy (chunked recu...	03-15 16:20	Success	-	View
exp_pytrain.20260315161708.005_20260315_161732 Paper: pytrain.20260315161708.005	Type-Safe Dynamic Plugin Loader Objective This benchmark evaluates the implementation of a robust, type-safe plugin discovery system using Python's standard library. It tests proficiency in dynamic code loading (`importlib`) and Structural Sub-typing (`typing.Protocol`)....	03-15 16:17	Success	-	View
exp_self.20260315161500.006_20260315_161525 Paper: self.20260315161500.006	Self-Directed Benchmark: SSM Strategy Stress Test This benchmark evaluates the memory efficiency and throughput of a State Space Model (SSM) strategy compared to a standard quadratic attention mechanism under constrained resources. The hypothesis posits that a disciplined memory policy (co...	03-15 16:15	Success	-	View
exp_self.20260315161215.005_20260315_161243 Paper: self.20260315161215.005	Benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that a disciplined memory policy within a State Space Model (SSM) architecture improves throughput and reduces VRAM footprint compared to a naive accumulation baseline. Requirements - Python 3.8+ - Py...	03-15 16:12	Success	-	View
exp_pytrain.20260315160916.004_20260315_160958 Paper: pytrain.20260315160916.004	Python Skill Fallback Title: Typed CSV Data Pipeline Module - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 16:10	Success	-	View
exp_self.20260315160603.004_20260315_160637 Paper: self.20260315160603.004	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates a "Memory-Disciplined" State Space Model (SSM) strategy against a standard naive implementation. The hypothesis is that an SSM approach, which explicitly manages state history rather than materializing the entire at...	03-15 16:07	Success	-	View
exp_self.20260315160247.003_20260315_160337 Paper: self.20260315160247.003	Benchmark: SSM Strategy Stress Test This repository contains a lightweight, runnable benchmark designed to test the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy improves throughput and efficiency under strict VRAM constraints (<8GB). Hyp...	03-15 16:03	Success	-	View
exp_pytrain.20260315155952.003_20260315_160025 Paper: pytrain.20260315155952.003	Strictly-Typed Plugin Registry with Runtime Validation This coding drill implements a strictly-typed Plugin System using Python's `typing.Protocol` and the `@runtime_checkable` decorator. Unlike traditional Abstract Base Classes (ABCs) that rely on inheritance, this approach uses Structural Sub...	03-15 16:00	Success	-	View
exp_self.20260315155616.002_20260315_155643 Paper: self.20260315155616.002	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315155616.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 15:56	Success	-	View
exp_pytrain.20260315155252.002_20260315_155322 Paper: pytrain.20260315155252.002	Modern Generic Stack with Module Encapsulation Objective This benchmark evaluates the implementation of a Python 3.12 generic stack class utilizing the new PEP 695 Type Parameter Syntax. It tests adherence to modern module packaging standards, including strict API definition via `__...	03-15 15:53	Success	-	View
exp_self.20260315154412.001_20260315_154439 Paper: self.20260315154412.001	Self-directed benchmark: SSM strategy stress test Overview This benchmark evaluates the memory efficiency and throughput of a State Space Model (SSM) strategy compared to a standard Transformer baseline. The innovation hypothesis is that an SSM with a disciplined memory policy (using recur...	03-15 15:51	Success	-	View
exp_pytrain.20260315154100.001_20260315_154127 Paper: pytrain.20260315154100.001	Dynamic Package Builder with Runtime Type Verification Hypothesis: An autonomous coding system can utilize Python's standard library to programmatically construct a valid package namespace and enforce strict type safety (Generics and Protocols) at runtime without relying on external static...	03-15 15:41	Success	-	View
exp_self.20260315153603.032_20260315_153636 Paper: self.20260315153603.032	SSM Strategy Stress Test: Memory Disciplined Benchmark This benchmark evaluates the performance of a State Space Model (SSM) inference strategy under strict memory constraints (simulating an 8GB VRAM environment). Hypothesis Applying an SSM with a disciplined memory policy (chunking + precision...	03-15 15:36	Success	-	View
exp_pytrain.20260315153303.017_20260315_153337 Paper: pytrain.20260315153303.017	Dynamic Extension Loader with Protocol Validation Problem Statement Modern Python plugin architectures require a mechanism to load code at runtime (dynamic packaging) while guaranteeing that the loaded code adheres to specific contracts (typing/protocols). Without strict runtime validation...	03-15 15:33	Success	-	View
exp_self.20260315153058.031_20260315_153118 Paper: self.20260315153058.031	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315153058.031 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 15:31	Success	-	View
exp_self.20260315152802.030_20260315_152828 Paper: self.20260315152802.030	Self-directed benchmark: ssm strategy stress test This repository contains a benchmark designed to test the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy improves throughput under 8GB VRAM constraints compared to standard Transformer Attention mechanis...	03-15 15:28	Success	-	View
exp_pytrain.20260315152518.016_20260315_152545 Paper: pytrain.20260315152518.016	Python Skill Fallback Title: Strictly Typed CSV Data Ingestion Module - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 15:25	Success	-	View
exp_self.20260315152311.029_20260315_152337 Paper: self.20260315152311.029	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315152311.029 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 15:23	Success	-	View
exp_self.20260315152039.028_20260315_152105 Paper: self.20260315152039.028	SSM Strategy Stress Test Benchmark This benchmark evaluates the impact of a "disciplined memory policy" on State Space Model (SSM) inference throughput under tight VRAM constraints. Hypothesis Applying an SSM recurrence strategy with explicit chunking and state management ma...	03-15 15:21	Success	-	View
exp_pytrain.20260315151735.015_20260315_151806 Paper: pytrain.20260315151735.015	Python Skill Fallback Title: Typed Configuration Factory using PEP 695 - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 15:18	Success	-	View
exp_self.20260315151528.027_20260315_151559 Paper: self.20260315151528.027	Self-Directed Benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that a disciplined memory policy (specifically chunked processing and dynamic precision) applied to State Space Models (SSM) improves throughput under strict memory constraints (target < 8GB VRAM). Hy...	03-15 15:16	Success	-	View
exp_self.20260315151223.026_20260315_151247 Paper: self.20260315151223.026	SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy improves throughput under constrained VRAM environments (8GB limit). It contrasts a standard Attention-based block against a si...	03-15 15:13	Success	-	View
exp_pytrain.20260315150912.014_20260315_150938 Paper: pytrain.20260315150912.014	Type-Safe Component Registry and Dependency Resolver Benchmark This drill evaluates the developer's ability to construct a robust, zero-dependency component loader using Python's advanced typing features. Objective Design a generic `PluginRegistry` system that manages component lifecycle and dependenci...	03-15 15:09	Success	-	View
exp_self.20260315150704.025_20260315_150732 Paper: self.20260315150704.025	Self-directed benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the performance efficiency of a State Space Model (SSM) strategy compared to a standard Transformer baseline under constrained memory conditions (targeting <8GB VRAM). Hypothesis Applying SSM with a discipl...	03-15 15:07	Success	-	View
exp_self.20260315150402.024_20260315_150434 Paper: self.20260315150402.024	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315150402.024 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 15:04	Success	-	View
exp_pytrain.20260315150114.013_20260315_150146 Paper: pytrain.20260315150114.013	Python Skill Fallback Title: Generic Plugin Registry with Dynamic Imports - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 15:01	Success	-	View
exp_self.20260315145907.023_20260315_145945 Paper: self.20260315145907.023	Benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy improves throughput under 8GB VRAM constraints compared to standard attention mechanisms. Hypothesis SSMs maintain a fixed-size...	03-15 14:59	Success	-	View
exp_self.20260315145606.022_20260315_145641 Paper: self.20260315145606.022	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315145606.022 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 14:56	Success	-	View
exp_pytrain.20260315145305.012_20260315_145336 Paper: pytrain.20260315145305.012	Strictly-Typed Dynamic Plugin Registry Overview This benchmark demonstrates the implementation of a robust, type-safe plugin architecture using Python's standard `typing` module. It mirrors architectural patterns found in major ML libraries (like Hugging Face Transformers) to en...	03-15 14:53	Success	-	View
exp_self.20260315145048.021_20260315_145120 Paper: self.20260315145048.021	SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Models (SSM) with a disciplined memory policy improves throughput under 8GB VRAM constraints compared to standard dense architectures. Methodology We compare two modes of...	03-15 14:51	Success	-	View
exp_self.20260315144746.020_20260315_144808 Paper: self.20260315144746.020	Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies—specifically constant-state memory management combined with dynamic precision—improves throughput and stability under strict 8GB VRAM...	03-15 14:48	Success	-	View
exp_pytrain.20260315144432.011_20260315_144456 Paper: pytrain.20260315144432.011	Strictly-Typed Dynamic Plugin Loader Overview This coding drill evaluates the ability to synthesize Python's advanced typing features (Protocols, Generics, Type Guards) with standard library packaging tools (`importlib`). The goal is to create a robust, runtime-extensible arch...	03-15 14:45	Success	-	View
exp_self.20260315144128.019_20260315_144154 Paper: self.20260315144128.019	SSM Strategy Stress Test This benchmark evaluates the hypothesis that State Space Models (SSMs) combined with a disciplined memory policy (dynamic precision and caching) deliver superior throughput compared to standard Transformer-style architectures when o...	03-15 14:42	Success	-	View
exp_pytrain.20260315143821.010_20260315_143851 Paper: pytrain.20260315143821.010	Python Skill Fallback Title: Strictly Typed ZipApp Generator - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 14:38	Success	-	View
exp_self.20260315143610.018_20260315_143638 Paper: self.20260315143610.018	SSM Strategy Stress Test Benchmark Overview This benchmark evaluates the hypothesis that a Selective State Space Model (SSM) strategy, combined with a disciplined memory policy, improves throughput under constrained VRAM conditions (e.g., 8GB) compared to a standard Transfor...	03-15 14:36	Success	-	View
exp_self.20260315143335.017_20260315_143358 Paper: self.20260315143335.017	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315143335.017 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 14:34	Success	-	View
exp_pytrain.20260315143036.009_20260315_143101 Paper: pytrain.20260315143036.009	Runtime-Verified Plugin Loader Design Brief This benchmark demonstrates a zero-dependency plugin architecture using Python's `typing.Protocol` for structural subtyping. It simulates a packaging system by programmatically creating virtual modules using `types.ModuleType`,...	03-15 14:31	Success	-	View
exp_self.20260315142813.016_20260315_142846 Paper: self.20260315142813.016	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315142813.016 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 14:28	Success	-	View
exp_self.20260315142506.015_20260315_142532 Paper: self.20260315142506.015	README: SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that applying an SSM (State Space Model) strategy with a disciplined memory policy significantly improves throughput (tokens/sec) and reduces VRAM usage compared to a naive baseline when operating und...	03-15 14:25	Success	-	View
exp_pytrain.20260315142212.008_20260315_142239 Paper: pytrain.20260315142212.008	Generic Storage Package with Protocol Enforcement This coding drill verifies the ability to design a Python package structure that adheres to modern packaging standards (`src` layout, `pyproject.toml`) and utilizes advanced typing features (`Protocol`, `Generic`, `TypeVar`) to enforce stri...	03-15 14:22	Success	-	View
exp_self.20260315141931.014_20260315_141954 Paper: self.20260315141931.014	Self-directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with disciplined memory policies (recurrent state management and dynamic precision) improves inference throughput under strict VRAM constraint...	03-15 14:20	Success	-	View
exp_self.20260315141701.013_20260315_141724 Paper: self.20260315141701.013	Self-Directed Benchmark: SSM Strategy Stress Test Innovation Overview This benchmark tests the hypothesis that applying a State Space Model (SSM) approach with a disciplined memory policy improves throughput under strict 8GB VRAM constraints compared to traditional Attention-based caching...	03-15 14:17	Success	-	View
exp_pytrain.20260315141342.007_20260315_141415 Paper: pytrain.20260315141342.007	Type-Generic Plugin Registry with Protocol Enforcement This benchmark demonstrates the construction of a robust, modular plugin system using Python's `typing` module. It enforces structural interfaces via `Protocol` and manages algorithm components using a type-safe `Generic` registry. Features...	03-15 14:14	Success	-	View
exp_self.20260315140035.012_20260315_140105 Paper: self.20260315140035.012	Self-directed benchmark: SSM strategy stress test Overview This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy and dynamic precision improves throughput under tight 8GB VRAM constraints. It compares a standard Attention-b...	03-15 14:11	Success	-	View
exp_self.20260315135754.011_20260315_135817 Paper: self.20260315135754.011	SSM Strategy Stress Test This benchmark compares the memory footprint and throughput of a standard Transformer-style Attention mechanism against a State Space Model (SSM) implementation. Hypothesis: The SSM approach, utilizing a disciplined recurrent memory pol...	03-15 13:58	Success	-	View
exp_pytrain.20260315135459.006_20260315_135524 Paper: pytrain.20260315135459.006	Dynamic Plugin Loader with Runtime Type Verification Objective This benchmark evaluates a system's ability to dynamically construct a Python package environment at runtime, load arbitrary code modules, and strictly enforce interface compliance using Python's `typing.Protocol`. Scenario The sc...	03-15 13:55	Success	-	View
exp_self.20260315135309.010_20260315_135331 Paper: self.20260315135309.010	SSM Strategy Stress Test Benchmark Overview This benchmark evaluates the hypothesis that applying a State Space Model (SSM) strategy with a disciplined memory policy improves throughput under 8GB VRAM constraints compared to a baseline implementation. Hypothesis By leveragin...	03-15 13:53	Success	-	View
exp_self.20260315135021.009_20260315_135046 Paper: self.20260315135021.009	Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that applying a Selective State Space Model (SSM) with a disciplined memory policy improves throughput under 8GB VRAM constraints compared to a standard Transformer baseline. The Innovati...	03-15 13:50	Success	-	View
exp_pytrain.20260315134740.005_20260315_134814 Paper: pytrain.20260315134740.005	Python Skill Fallback Title: Robust Semantic Versioning & Constraint Resolver - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 13:48	Success	-	View
exp_self.20260315134512.008_20260315_134537 Paper: self.20260315134512.008	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315134512.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 13:45	Success	-	View
exp_self.20260315134222.007_20260315_134247 Paper: self.20260315134222.007	SSM Strategy Stress Test This benchmark evaluates the hypothesis that State Space Model (SSM) architectures significantly reduce VRAM usage compared to standard Transformers when processing long sequences under strict memory constraints. Setup We compare two approa...	03-15 13:42	Success	-	View
exp_pytrain.20260315133948.004_20260315_134011 Paper: pytrain.20260315133948.004	Strictly-Typed Modular Log Analyzer Overview This benchmark evaluates the implementation of a `log_analyzer` module that serves as both a reusable library and a standalone script. The design enforces strict static typing (`mypy --strict`), explicit public APIs (`__all__`), an...	03-15 13:40	Success	-	View
exp_self.20260315133740.006_20260315_133804 Paper: self.20260315133740.006	Self-Directed Benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Models (SSM) with a disciplined memory policy significantly improves inference throughput under constrained VRAM (8GB limit). The Innovation The proposed strategy com...	03-15 13:38	Success	-	View
exp_self.20260315133448.005_20260315_133518 Paper: self.20260315133448.005	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Models (SSM) with a disciplined memory policy (specifically dynamic precision and activation checkpointing) improves throughput and fits within strict VRAM constraints (8GB)...	03-15 13:35	Success	-	View
exp_pytrain.20260315133200.003_20260315_133238 Paper: pytrain.20260315133200.003	Coding Drill: Strictly Typed Dynamic Plugin Loader Hypothesis An autonomous coding system can robustly integrate external functionality by simulating a package environment and enforcing structural subtyping (Protocols) to validate plugin interfaces before execution, thereby preventing runti...	03-15 13:32	Success	-	View
exp_self.20260315133000.004_20260315_133027 Paper: self.20260315133000.004	Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that applying State Space Model (SSM) architectures with a disciplined memory policy improves inference throughput under strict 8GB VRAM constraints compared to standard attention-based basel...	03-15 13:30	Success	-	View
exp_self.20260315132656.003_20260315_132729 Paper: self.20260315132656.003	Self-directed Benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy (specifically, chunked state management and caching) improves throughput under constrained VRAM environments (<8GB). It compare...	03-15 13:27	Success	-	View
exp_pytrain.20260315132425.002_20260315_132448 Paper: pytrain.20260315132425.002	PEP 695 Generic Dependency Resolver Overview This benchmark validates the implementation of a directed acyclic graph (DAG) dependency resolver using Python 3.12+ Type Parameter Syntax (PEP 695). The goal is to demonstrate the reduction of boilerplate code by utilizing the...	03-15 13:24	Success	-	View
exp_self.20260315132149.002_20260315_132213 Paper: self.20260315132149.002	SSM Strategy Stress Test Benchmark This repository contains a benchmark designed to test the hypothesis that applying State Space Model (SSM) strategies with a disciplined memory policy and dynamic precision improves throughput under strict 8GB VRAM constraints. Hypo...	03-15 13:22	Success	-	View
exp_self.20260315131817.001_20260315_131856 Paper: self.20260315131817.001	Self-directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the performance of a State Space Model (SSM) simulation under strict memory constraints (8GB limit). It tests the hypothesis that applying Dynamic Precision (Float16) and a disciplined **Cache/Memory Po...	03-15 13:18	Success	-	View
exp_pytrain.20260315131524.001_20260315_131548 Paper: pytrain.20260315131524.001	Typing-Driven Dynamic Plugin Loader This benchmark validates a Python architecture that enforces strict interface contracts at runtime using `typing.Protocol` and `importlib`. Objective The goal is to simulate a modular plugin system where code is discovered dynamically from...	03-15 13:15	Success	-	View
exp_self.20260315131222.014_20260315_131254 Paper: self.20260315131222.014	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315131222.014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 13:12	Success	-	View
exp_pytrain.20260315130947.009_20260315_131008 Paper: pytrain.20260315130947.009	Python Skill Fallback Title: Protocol-Based Plugin Loader with ImportLib Simulation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 13:10	Success	-	View
exp_self.20260315130713.013_20260315_130739 Paper: self.20260315130713.013	Self-Directed SSM Strategy Stress Test This benchmark evaluates the hypothesis that a disciplined memory policy within a State Space Model (SSM) architecture improves throughput under constrained VRAM (8GB). Methodology We compare two variants of a recurrent SSM block: 1. **Abla...	03-15 13:07	Success	-	View
exp_self.20260315130411.012_20260315_130439 Paper: self.20260315130411.012	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315130411.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 13:04	Success	-	View
exp_pytrain.20260315130147.008_20260315_130208 Paper: pytrain.20260315130147.008	Dynamic Package Loader with PEP 695 Type Constraints This benchmark demonstrates the integration of modern Python type hinting (PEP 695) with runtime dynamic module loading. It simulates a plugin architecture where a temporary Python package is constructed programmatically, loaded via `import...	03-15 13:02	Success	-	View
exp_self.20260315125940.011_20260315_130006 Paper: self.20260315125940.011	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315125940.011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 13:00	Success	-	View
exp_self.20260315125635.010_20260315_125702 Paper: self.20260315125635.010	Self-directed Benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with disciplined memory policies significantly improves throughput and efficiency under strict 8GB VRAM constraints compared to standard full-context c...	03-15 12:57	Success	-	View
exp_pytrain.20260315125347.007_20260315_125417 Paper: pytrain.20260315125347.007	Dynamic Plugin Registry with Runtime Type Checking This benchmark demonstrates a robust, dependency-free plugin architecture using Python's standard library. It leverages `typing.Protocol` for structural subtyping (duck typing with static and runtime verification) and `importlib` for dynami...	03-15 12:54	Success	-	View
exp_self.20260315125051.009_20260315_125123 Paper: self.20260315125051.009	Self-directed benchmark: ssm strategy stress test This benchmark evaluates the hypothesis that applying State Space Model (SSM) memory principles—specifically a disciplined memory policy and dynamic precision—improves inference throughput under strict 8GB VRAM constraints compared to a sta...	03-15 12:51	Success	-	View
exp_pytrain.20260315124749.006_20260315_124816 Paper: pytrain.20260315124749.006	Modular Configuration Registry Benchmark This benchmark tests the implementation of a robust, type-safe configuration management system using Python's standard library. The task requires the creation of a `config_registry` system that enforces strict typing via `typing.Protocol` a...	03-15 12:48	Success	-	View
exp_self.20260315124526.008_20260315_124548 Paper: self.20260315124526.008	Self-directed benchmark: ssm strategy stress test Overview This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with a disciplined memory policy improves throughput under strict 8GB VRAM constraints. It compares a standard Transformer-style KV-Cache appr...	03-15 12:45	Success	-	View
exp_self.20260315124234.007_20260315_124254 Paper: self.20260315124234.007	Self-directed benchmark: ssm strategy stress test This benchmark evaluates the impact of a disciplined memory policy and mixed precision on a State Space Model (SSM) simulation. Hypothesis Applying SSM inference with chunked processing and dynamic precision (FP16) significantly reduces VRA...	03-15 12:43	Success	-	View
exp_pytrain.20260315123917.005_20260315_123943 Paper: pytrain.20260315123917.005	Strict Metadata Validator and Dependency Resolver This project implements a lightweight package manager simulation in Python, focusing on strict type enforcement and robust dependency resolution. Features - Strict Typing: Uses `typing.TypedDict` to enforce the structure of package meta...	03-15 12:39	Success	-	View
exp_self.20260315123655.006_20260315_123720 Paper: self.20260315123655.006	SSM Strategy Stress Test: Memory vs. Throughput This benchmark evaluates the hypothesis that applying State Space Model (SSM) techniques with a disciplined memory policy (specifically, chunked recurrence vs. unrolled convolution) improves throughput under constrained memory (8GB VRAM tar...	03-15 12:37	Success	-	View
exp_self.20260315123337.005_20260315_123407 Paper: self.20260315123337.005	SSM Strategy Stress Test Benchmark This benchmark evaluates the performance of a Selective State Space Model (SSM) implementation under different memory and precision policies. It compares a baseline floating-point implementation against an optimized variant that leverages d...	03-15 12:34	Success	-	View
exp_pytrain.20260315123026.004_20260315_123100 Paper: pytrain.20260315123026.004	Python Skill Fallback Title: Strictly Typed Modular CLI Pipeline - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 12:31	Success	-	View
exp_self.20260315122735.004_20260315_122759 Paper: self.20260315122735.004	SSM Strategy Stress Test This benchmark evaluates the hypothesis that a State Space Model (SSM) strategy with a disciplined memory policy improves throughput under 8GB VRAM constraints compared to standard attention mechanisms. Setup: We compare a standard Self...	03-15 12:28	Success	-	View
exp_pytrain.20260315122408.003_20260315_122445 Paper: pytrain.20260315122408.003	Robust Dynamic Plugin Loader with Structural Subtyping This benchmark demonstrates a robust plugin system architecture using Python's standard library. The goal is to simulate an autonomous system that: 1. Dynamically generates a temporary package structure on disk using `tempfile` and `pat...	03-15 12:24	Success	-	View
exp_self.20260315122206.003_20260315_122229 Paper: self.20260315122206.003	Self-directed benchmark: ssm strategy stress test Objective This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with disciplined memory policies and dynamic precision can improve throughput under constrained memory environments (8GB VRAM target). It com...	03-15 12:22	Success	-	View
exp_self.20260315121902.002_20260315_121929 Paper: self.20260315121902.002	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315121902.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 12:19	Success	-	View
exp_pytrain.20260315121544.002_20260315_121612 Paper: pytrain.20260315121544.002	Generic Plugin Registry Benchmark using PEP 695 This benchmark tests the implementation of a generic plugin registry utilizing Python 3.12's Type Parameter Syntax (PEP 695). It aims to reduce boilerplate associated with `typing.Generic` while maintaining strict type safety and runtime be...	03-15 12:16	Success	-	View
exp_self.20260315121302.001_20260315_121335 Paper: self.20260315121302.001	SSM Strategy Stress Test Benchmark This benchmark evaluates the performance and memory efficiency of an optimized State Space Model (SSM) implementation against a standard Transformer baseline. The focus is on a "disciplined memory policy," utilizing techniques like key-valu...	03-15 12:13	Success	-	View
exp_pytrain.20260315120853.001_20260315_120933 Paper: pytrain.20260315120853.001	Benchmark: Strict pyproject.toml Validator with TypedDict This benchmark evaluates a custom, recursive runtime validation engine for complex nested data structures (simulating `pyproject.toml` PEP 518/621 standards) using Python's standard `typing` module. It specifically tests the introspection o...	03-15 12:09	Success	-	View
exp_pytrain.20260315120346.006_20260315_120427 Paper: pytrain.20260315120346.006	Dynamic Plugin Loader with Protocol Constraints This coding drill validates a hypothesis about autonomous systems leveraging Python's `typing.Protocol` for structural subtyping and `importlib` for runtime module discovery. Objective Create a robust, dependency-free plugin architecture ca...	03-15 12:06	Pending	-	View
exp_self.20260315120123.010_20260315_120146 Paper: self.20260315120123.010	SSM Strategy Stress Test This benchmark evaluates the performance characteristics of a State Space Model (SSM) implementation under strict memory constraints. It simulates the inference throughput and VRAM usage of two configurations: 1. Baseline: Standard exec...	03-15 12:01	Success	-	View
exp_self.20260315115742.009_20260315_115813 Paper: self.20260315115742.009	Benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that a State Space Model (SSM) strategy, combined with disciplined memory policies and dynamic precision, maintains higher throughput than standard quadratic-attention mechanisms under strict 8GB VRAM...	03-15 11:58	Success	-	View
exp_pytrain.20260315115445.005_20260315_115508 Paper: pytrain.20260315115445.005	Strict-Typed Dynamic Plugin Loader Overview This benchmark evaluates the ability to construct a robust, extensible plugin architecture using Python's standard `importlib` for dynamic module discovery and `typing.Protocol` for strict interface enforcement. Problem Statement T...	03-15 11:55	Success	-	View
exp_self.20260315115245.008_20260315_115311 Paper: self.20260315115245.008	Self-directed benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with a disciplined memory policy (including dynamic precision) improves throughput while adhering to strict VRAM constraints (< 8GB). It compa...	03-15 11:53	Success	-	View
exp_self.20260315115021.007_20260315_115042 Paper: self.20260315115021.007	SSM Strategy Stress Test: Memory vs. Throughput Overview This benchmark evaluates the performance impact of a disciplined memory policy on State Space Models (SSMs). It compares a Baseline (Ablated) configuration against an Optimized (Innovation) configuration that leverages dyna...	03-15 11:50	Success	-	View
exp_pytrain.20260315114715.004_20260315_114738 Paper: pytrain.20260315114715.004	Strictly Typed Modular Plugin System Benchmark ID: `strict_typing_plugin_system` Hypothesis: An autonomous coding system can effectively utilize Python's packaging conventions and advanced static typing features to build a robust, extensible data processing framework w...	03-15 11:47	Success	-	View
exp_self.20260315114503.006_20260315_114531 Paper: self.20260315114503.006	Self-directed benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that applying a State Space Model (SSM) with a disciplined memory policy and dynamic precision (bfloat16) significantly improves inference throughput and reduces VRAM footprint compared t...	03-15 11:45	Success	-	View
exp_self.20260315114147.005_20260315_114215 Paper: self.20260315114147.005	Benchmark: SSM Strategy Stress Test This benchmark evaluates the performance of a standard Transformer architecture (Baseline) against a State Space Model (SSM) simulation (Innovation) under constrained memory conditions. Hypothesis Applying an SSM strategy with disciplined m...	03-15 11:42	Success	-	View
exp_pytrain.20260315113919.003_20260315_113938 Paper: pytrain.20260315113919.003	Python Skill Fallback Title: Typed Async Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 11:39	Success	-	View
exp_self.20260315113637.004_20260315_113657 Paper: self.20260315113637.004	SSM Strategy Stress Test Benchmark This repository contains a minimal benchmark designed to evaluate the hypothesis that State Space Model (SSM) architectures with disciplined memory policies provide superior throughput and memory efficiency compared to standard Attention-ba...	03-15 11:37	Success	-	View
exp_self.20260315113343.003_20260315_113401 Paper: self.20260315113343.003	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315113343.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 11:34	Success	-	View
exp_pytrain.20260315113100.002_20260315_113121 Paper: pytrain.20260315113100.002	PEP 695 Generic Container Benchmark This benchmark evaluates your ability to implement modern Python 3.12+ features, specifically PEP 695 (Type Parameter Syntax), within a robust, package-ready structure. Problem Statement Design a thread-safe generic key-value cache named `S...	03-15 11:31	Success	-	View
exp_self.20260315112837.002_20260315_112859 Paper: self.20260315112837.002	Self-directed Benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that a State Space Model (SSM) strategy, utilizing a disciplined memory policy, significantly improves inference throughput compared to a standard Transformer baseline under strict 8GB VRAM constr...	03-15 11:29	Success	-	View
exp_self.20260315112528.001_20260315_112558 Paper: self.20260315112528.001	Self-directed benchmark: ssm strategy stress test Hypothesis Applying SSM (State Space Model) with a disciplined memory policy improves throughput and efficiency under 8GB VRAM constraints compared to standard attention-based architectures. Plan 1. Environment: PyTorch script runnable...	03-15 11:26	Success	-	View
exp_pytrain.20260315112235.001_20260315_112312 Paper: pytrain.20260315112235.001	Python Skill Fallback Title: Runtime Type-Checked Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 11:23	Success	-	View
exp_self.20260315103305.015_20260315_103342 Paper: self.20260315103305.015	SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Model (SSM) architectures with a disciplined memory policy (specifically gradient checkpointing and chunked state management) improves throughput under strict 8GB VRAM co...	03-15 10:33	Pending	-	View
exp_pytrain.20260315102855.012_20260315_102929 Paper: pytrain.20260315102855.012	Python Skill Fallback Title: Type-Safe Plugin Architecture with Versioning Metadata - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 10:29	Success	-	View
exp_self.20260315102550.014_20260315_102631 Paper: self.20260315102550.014	Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the efficiency of State Space Models (SSM) strategies against standard Transformer-based attention mechanisms. Specifically, it tests the hypothesis that applying an SSM strategy with a disciplined memory p...	03-15 10:26	Success	-	View
exp_pytrain.20260315102230.011_20260315_102303 Paper: pytrain.20260315102230.011	Python Skill Fallback Title: Type-Safe CLI Application Architecture - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 10:23	Success	-	View
exp_self.20260315101504.013_20260315_101533 Paper: self.20260315101504.013	SSM Strategy Stress Test This benchmark evaluates the memory efficiency and throughput of a State Space Model (SSM) strategy compared to a standard Attention-based Transformer baseline under constrained memory conditions. Hypothesis Applying SSM with a disc...	03-15 10:20	Success	-	View
exp_self.20260315101155.012_20260315_101220 Paper: self.20260315101155.012	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315101155.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 10:12	Success	-	View
exp_pytrain.20260315100812.010_20260315_100848 Paper: pytrain.20260315100812.010	Python Skill Fallback Title: Dynamic ZipApp Construction with Protocol Enforcement - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 10:08	Success	-	View
exp_self.20260315100514.011_20260315_100540 Paper: self.20260315100514.011	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315100514.011 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 10:05	Success	-	View
exp_pytrain.20260315100155.009_20260315_100223 Paper: pytrain.20260315100155.009	Typed Configuration Package with CLI Interface Overview This benchmark evaluates a system's ability to generate a Python script that implements a robust configuration management module. The script must utilize advanced typing features (`typing.TypedDict`) for schema definition and `argp...	03-15 10:02	Success	-	View
exp_self.20260315095850.010_20260315_095926 Paper: self.20260315095850.010	SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying a Selective State Space Model (SSM) with a disciplined memory policy improves inference throughput under strict 8GB VRAM constraints compared to standard Transformer Attention mechanisms...	03-15 09:59	Success	-	View
exp_pytrain.20260315095459.008_20260315_095600 Paper: pytrain.20260315095459.008	Generic Dependency Resolver Benchmark Overview This benchmark tests the implementation of a robust generic dependency resolver using Python's standard library type system features (PEP 484, PEP 695 concepts). Objective Implement a resolver that can process package dependencies,...	03-15 09:56	Success	-	View
exp_self.20260315095145.009_20260315_095220 Paper: self.20260315095145.009	SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that applying State Space Model (SSM) architectures with a disciplined memory policy improves throughput and reduces VRAM overhead compared to standard attention-based mechanisms under constr...	03-15 09:52	Success	-	View
exp_pytrain.20260315094754.007_20260315_094849 Paper: pytrain.20260315094754.007	Dynamic Plugin Architecture with Type Safety This benchmark verifies the ability to dynamically scaffold a Python package structure in a runtime environment, utilizing Python's `typing` module to enforce structural subtyping (Protocol) and `importlib` to load the generated code. Objec...	03-15 09:48	Success	-	View
exp_self.20260315094459.008_20260315_094530 Paper: self.20260315094459.008	Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the performance characteristics of a State Space Model (SSM) strategy against a standard dense baseline. The hypothesis is that applying an SSM with a disciplined memory policy (chunked inference and state...	03-15 09:45	Success	-	View
exp_self.20260315094112.007_20260315_094201 Paper: self.20260315094112.007	This benchmark is designed to evaluate the hypothesis that an SSM-based architecture, when coupled with a disciplined me... The implementation simulates a standard Transformer layer (Baseline) against a Recurrent SSM layer (Innovation). Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark validates the memory efficiency and throughput of a S...	03-15 09:42	Success	-	View
exp_pytrain.20260315093814.006_20260315_093844 Paper: pytrain.20260315093814.006	Python Skill Fallback Title: Type-Safe Plugin Registry with Dependency Constraints - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 09:38	Success	-	View
exp_self.20260315093521.006_20260315_093559 Paper: self.20260315093521.006	Self-directed benchmark: SSM Strategy Stress Test This repository contains a runnable benchmark designed to test the hypothesis: Applying SSM (State Space Model) logic with a disciplined memory policy improves throughput under 8GB constraints. Objective To compare the VRAM usage and infe...	03-15 09:36	Success	-	View
exp_pytrain.20260315093057.005_20260315_093203 Paper: pytrain.20260315093057.005	Strict Dependency Resolver Engine Overview This benchmark tests the ability to implement a core component of package management systems: the Dependency Resolver. The goal is to construct a robust, type-safe engine that determines a valid installation plan given a set of pac...	03-15 09:32	Success	-	View
exp_self.20260315092818.005_20260315_092859 Paper: self.20260315092818.005	Self-directed Benchmark: SSM Strategy Stress Test Hypothesis Applying State Space Models (SSM) with a disciplined memory policy and dynamic precision improves throughput under 8GB VRAM constraints compared to standard dense attention mechanisms. Benchmark Plan We compare a standard Transfo...	03-15 09:29	Success	-	View
exp_pytrain.20260315092413.004_20260315_092459 Paper: pytrain.20260315092413.004	Python Skill Fallback Title: Dynamic Plugin Loader with Structural Typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 09:25	Success	-	View
exp_self.20260315092128.004_20260315_092202 Paper: self.20260315092128.004	SSM Strategy Stress Test This benchmark evaluates the impact of a disciplined memory management policy (chunked recurrent processing) on State Space Model (SSM) workloads under tight VRAM constraints. Hypothesis Applying an SSM with a disciplined memory policy...	03-15 09:22	Success	-	View
exp_pytrain.20260315091747.003_20260315_091824 Paper: pytrain.20260315091747.003	Type-Safe Plugin Registry with Async Dispatch This benchmark implements a modular task runner using Python's standard library to demonstrate a clean separation of interface definition, implementation registration, and asynchronous execution. Design Brief Modern software architecture re...	03-15 09:18	Success	-	View
exp_self.20260315091355.003_20260315_091503 Paper: self.20260315091355.003	Self-directed Benchmark: SSM Strategy Stress Test Hypothesis Applying an SSM (State Space Model) strategy with a disciplined memory policy improves throughput (tokens/sec) and reduces VRAM footprint compared to a naive implementation under 8GB VRAM constraints. Abstract This benchmark test...	03-15 09:15	Success	-	View
exp_self.20260315091056.002_20260315_091124 Paper: self.20260315091056.002	Self-directed benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with a disciplined memory policy improves throughput and reduces VRAM usage compared to standard attention-based baselines. The test compares...	03-15 09:11	Success	-	View
exp_pytrain.20260315090730.002_20260315_090806 Paper: pytrain.20260315090730.002	PEP 695 Generic Registry Benchmark This benchmark tests the implementation of a generic type-safe registry using Python 3.12's PEP 695 syntax. It verifies syntax correctness, type parameter scoping, and runtime behavior while measuring throughput. Requirements - Python 3.12...	03-15 09:08	Success	-	View
exp_self.20260315090448.001_20260315_090531 Paper: self.20260315090448.001	Self-Directed Benchmark: SSM Strategy Stress Test This benchmark evaluates the memory efficiency and throughput of a State Space Model (SSM) strategy versus a standard dense attention baseline. Hypothesis Applying an SSM approach with a disciplined memory policy (fixed state recurrence) ma...	03-15 09:05	Success	-	View
exp_pytrain.20260315090036.001_20260315_090129 Paper: pytrain.20260315090036.001	Benchmark: Structural Typing and Dynamic Plugin Loader Overview This coding drill evaluates the ability to design a robust, type-safe plugin architecture using Python's standard library. The benchmark focuses on Structural Typing (using `typing.Protocol` and `@runtime_checkable`) and **Pack...	03-15 09:01	Success	-	View
exp_pytrain.20260315084527.016_20260315_084555 Paper: pytrain.20260315084527.016	Python Skill Fallback Title: Type-Validated Dynamic Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 08:45	Success	-	View
exp_self.20260315084217.018_20260315_084254 Paper: self.20260315084217.018	SSM Strategy Stress Test Overview This benchmark evaluates the performance of State Space Model (SSM) inference under constrained memory conditions (8GB VRAM limit). It compares two modes: 1. Baseline (Ablated): Uses standard memory handling and full precision...	03-15 08:43	Success	-	View
exp_pytrain.20260315083851.015_20260315_083924 Paper: pytrain.20260315083851.015	Typed Observable State Container This benchmark implements a "mini-package" within a single file to demonstrate robust state management using Python's advanced typing features. Design Hypothesis Explicit use of Python's `typing` system (Generics and Protocols) enforces str...	03-15 08:39	Success	-	View
exp_self.20260315083622.017_20260315_083656 Paper: self.20260315083622.017	SSM Strategy Stress Test This benchmark evaluates the efficiency of a State Space Model (SSM) implementation under constrained VRAM (8GB limit). It contrasts a naive implementation against a memory-disciplined variant that utilizes dynamic chunking and cache optimi...	03-15 08:37	Success	-	View
exp_pytrain.20260315083130.014_20260315_083203 Paper: pytrain.20260315083130.014	Python Skill Fallback Title: Strictly Typed Module with Dynamic Protocol Resolution - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 08:32	Success	-	View
exp_self.20260315082827.016_20260315_082853 Paper: self.20260315082827.016	SSM Strategy Stress Test This benchmark evaluates the performance impact of applying a disciplined memory policy to State Space Model (SSM) operations under constrained VRAM environments (target: < 8GB). Hypothesis Applying SSM architectures with a disciplined memo...	03-15 08:28	Success	-	View
exp_pytrain.20260315082426.013_20260315_082518 Paper: pytrain.20260315082426.013	Dynamic Plugin Loader with Runtime Type Enforcement Overview This drill challenges you to build an extensible system that simulates a lightweight inference engine plugin architecture. You must implement a `PluginRegistry` that dynamically discovers, loads, and validates Python modules from t...	03-15 08:25	Success	-	View
exp_self.20260315082119.015_20260315_082212 Paper: self.20260315082119.015	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315082119.015 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 08:22	Success	-	View
exp_pytrain.20260315081801.012_20260315_081833 Paper: pytrain.20260315081801.012	Strictly-Typed Virtual Component Loader This benchmark tests the ability to construct a robust, dependency-free loader mechanism simulating Python packaging entry points (e.g., `package.module:Class`). It utilizes advanced typing features (`typing.Protocol`, `typing.Type`, Generi...	03-15 08:18	Success	-	View
exp_self.20260315081329.014_20260315_081409 Paper: self.20260315081329.014	Self-directed benchmark: ssm strategy stress test Overview This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with disciplined memory policies improves throughput (tokens/sec) while maintaining lower VRAM usage compared to standard Transformer attentio...	03-15 08:15	Success	-	View
exp_pytrain.20260315080952.011_20260315_081034 Paper: pytrain.20260315080952.011	Runtime-Checked Dynamic Plugin Loader This benchmark tests the ability to construct a robust, type-safe plugin architecture using Python's standard library. Objective Implement a `PluginLoader` system that: 1. Dynamically discovers Python modules in a target directory. 2. Inspe...	03-15 08:10	Success	-	View
exp_self.20260315080722.013_20260315_080749 Paper: self.20260315080722.013	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy improves throughput under strict VRAM constraints (simulating an 8GB environment) compared to standard Attention-based architec...	03-15 08:07	Success	-	View
exp_self.20260315080437.012_20260315_080502 Paper: self.20260315080437.012	SSM Strategy Stress Test This benchmark evaluates a State Space Model (SSM) based strategy against a standard Transformer attention baseline. The specific hypothesis is that the linear complexity of an SSM architecture (simulated here via a performant PyTorch appro...	03-15 08:05	Success	-	View
exp_pytrain.20260315080138.010_20260315_080203 Paper: pytrain.20260315080138.010	Protocol-Based Dynamic Extension Loader This benchmark tests the ability to design a robust, type-safe plugin system using Python's `typing.Protocol` for structural subtyping and `importlib` for runtime discovery. It simulates a package environment to verify strict interface adhe...	03-15 08:02	Success	-	View
exp_self.20260315075809.011_20260315_075839 Paper: self.20260315075809.011	Self-directed Benchmark: SSM Strategy Stress Test Overview This benchmark investigates the hypothesis that applying State Space Models (SSM) with a disciplined memory policy (specifically selective activation caching and dynamic precision) improves throughput under strict 8GB VRAM constrai...	03-15 07:59	Success	-	View
exp_self.20260315075521.010_20260315_075553 Paper: self.20260315075521.010	This benchmark compares a naive State Space Model (SSM) implementation against an optimized variant employing mixed prec... README.md SSM Strategy Stress Test Benchmark This repository contains a benchmark designed to test the hypothesis that applying SSM architectures with disciplined memory policies improves throughput under strict hardware constraints (8GB VR...	03-15 07:55	Success	-	View
exp_pytrain.20260315075236.009_20260315_075300 Paper: pytrain.20260315075236.009	Dynamic Protocol-Based Plugin Loader Benchmark This benchmark evaluates the ability of an autonomous agent to design a modular plugin system using Python's `typing.Protocol` for structural subtyping and `importlib` for dynamic runtime loading. Objective Create a self-contained script `b...	03-15 07:53	Success	-	View
exp_self.20260315074958.009_20260315_075033 Paper: self.20260315074958.009	SSM Strategy Stress Test This benchmark evaluates the hypothesis that State Space Models (SSM) with a disciplined memory policy improve throughput under strict VRAM constraints (8GB) compared to standard quadratic-attention mechanisms. Methodology We compare tw...	03-15 07:50	Success	-	View
exp_pytrain.20260315074550.008_20260315_074611 Paper: pytrain.20260315074550.008	Python Skill Fallback Title: Strictly Typed Pyproject Metadata Validator - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 07:46	Success	-	View
exp_self.20260315074250.008_20260315_074318 Paper: self.20260315074250.008	Self-directed benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that a State Space Model (SSM) utilizing a disciplined memory policy (specifically, state truncation and selective checkpointing) achieves higher throughput and lower VRAM consumption compare...	03-15 07:43	Success	-	View
exp_pytrain.20260315073933.007_20260315_074004 Paper: pytrain.20260315073933.007	Strictly Typed Dynamic Module Registry This coding drill benchmarks your ability to construct a robust, type-safe internal registry system that simulates a Python package's modular architecture. Overview The goal is to create a script `benchmark.py` that simulates a mini-package...	03-15 07:40	Success	-	View
exp_self.20260315073627.007_20260315_073702 Paper: self.20260315073627.007	SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy (specifically, chunking and dynamic precision) improves inference throughput and reduces VRAM usage compared to standard attent...	03-15 07:37	Success	-	View
exp_pytrain.20260315073247.006_20260315_073324 Paper: pytrain.20260315073247.006	Dynamic Component Registry with Runtime Protocol Validation Overview This benchmark evaluates the ability of a Python script to dynamically construct a library architecture, emulate a plugin system using `importlib`, and enforce strict runtime type validation using `typing.Protocol`. Objective Creat...	03-15 07:33	Success	-	View
exp_self.20260315072951.006_20260315_073026 Paper: self.20260315072951.006	SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with a disciplined memory policy improves throughput under 8GB VRAM constraints compared to a standard dense (ablated) baseline. Methodology The benchm...	03-15 07:30	Success	-	View
exp_self.20260315072611.005_20260315_072636 Paper: self.20260315072611.005	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315072611.005 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 07:26	Success	-	View
exp_pytrain.20260315072242.005_20260315_072333 Paper: pytrain.20260315072242.005	Python Skill Fallback Title: Dynamic Module Loader with Runtime Type Enforcement - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 07:23	Success	-	View
exp_self.20260315072006.004_20260315_072045 Paper: self.20260315072006.004	SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that a State Space Model (SSM) approach, utilizing a disciplined memory policy (recurrent state management), yields superior throughput and lower VRAM consumption compared to a standard Attention-base...	03-15 07:20	Success	-	View
exp_pytrain.20260315071617.004_20260315_071640 Paper: pytrain.20260315071617.004	Typed Plugin Registry and CLI Dispatcher Benchmark This benchmark evaluates a lightweight, modular Python framework that enforces strict interface contracts using `typing.Protocol` and runtime type checking. The system dynamically loads and executes "plugins" based on a defined structure, e...	03-15 07:16	Success	-	View
exp_self.20260315071258.003_20260315_071344 Paper: self.20260315071258.003	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the efficiency of State Space Models (SSM) compared to standard Transformer architectures under strict memory constraints (8GB VRAM limit). Overview The benchmark compares two implementations of a sequence processin...	03-15 07:13	Success	-	View
exp_pytrain.20260315070922.003_20260315_070949 Paper: pytrain.20260315070922.003	Robust Dynamic Plugin Loader with Type Safety This benchmark demonstrates a modular package architecture simulation using Python's standard library. It focuses on structural subtyping (`typing.Protocol`) and runtime validation (`inspect`, `isinstance`) to create a robust plugin system...	03-15 07:09	Success	-	View
exp_self.20260315070615.002_20260315_070656 Paper: self.20260315070615.002	Self-Directed Benchmark: SSM Strategy Stress Test This repository contains a runnable benchmark designed to test the hypothesis that applying State Space Model (SSM) strategies with a disciplined memory policy improves throughput under constrained VRAM (8GB). Overview The benchmark sim...	03-15 07:07	Success	-	View
exp_pytrain.20260315070205.002_20260315_070308 Paper: pytrain.20260315070205.002	Modern Generic Utilities with PEP 695 This benchmark verifies the implementation of modern Python generic types using PEP 695 Type Parameter Syntax (introduced in Python 3.12) within a strictly hygienic module structure. Goal To ensure the coding system can: 1. Define generic c...	03-15 07:03	Success	-	View
exp_self.20260315065841.001_20260315_065909 Paper: self.20260315065841.001	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315065841.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 06:59	Success	-	View
exp_pytrain.20260315065531.001_20260315_065605 Paper: pytrain.20260315065531.001	Strictly Typed PyProject.toml Generator Benchmark This benchmark evaluates a Python script's ability to leverage advanced type hinting features (`dataclasses`, `typing.Protocol`, and `Literal`) to construct a strictly typed domain model for Python project metadata (PEP 621). The objective...	03-15 06:56	Success	-	View
exp_pytrain.20260315065100.008_20260315_065145 Paper: pytrain.20260315065100.008	Strictly Typed Dependency Injection Container Benchmark This benchmark evaluates a modern Dependency Injection (DI) implementation in pure Python. It leverages PEP 695 (Type Parameter Syntax) to eliminate boilerplate associated with `typing.Generic` and `TypeVar`. The design utilizes `typing...	03-15 06:51	Success	-	View
exp_self.20260315064827.009_20260315_064906 Paper: self.20260315064827.009	SSM Strategy Stress Test Benchmark This repository contains a minimal benchmark designed to test the hypothesis that a State Space Model (SSM) utilizing a disciplined memory policy (specifically dynamic precision and efficient state caching) achieves higher throughput under...	03-15 06:49	Success	-	View
exp_self.20260315064521.008_20260315_064553 Paper: self.20260315064521.008	SSM Strategy Stress Test Benchmark Overview This benchmark evaluates the hypothesis that a State Space Model (SSM) strategy, utilizing a disciplined memory policy, improves throughput and reduces VRAM usage compared to a standard full-context attention baseline under tight m...	03-15 06:46	Success	-	View
exp_pytrain.20260315064216.007_20260315_064244 Paper: pytrain.20260315064216.007	Strictly-Typed Dynamic Plugin Loader Overview This coding drill validates the hypothesis that structural subtyping (via `typing.Protocol`) combined with dynamic module generation (`types.ModuleType`) creates a robust, type-safe plugin system without sacrificing the flexibility...	03-15 06:42	Success	-	View
exp_self.20260315063844.007_20260315_063939 Paper: self.20260315063844.007	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that a disciplined State Space Model (SSM) memory policy yields superior throughput and memory efficiency compared to standard attention mechanisms under strict 8GB VRAM constraints. Hypothesis Ap...	03-15 06:40	Success	-	View
exp_pytrain.20260315063602.006_20260315_063617 Paper: pytrain.20260315063602.006	Robust Dynamic Plugin Loader with Type Safety Objective This benchmark simulates a robust plugin loading system similar to those found in large-scale inference libraries (like `vllm` or `diffusers`). It tests the ability to define strict interfaces using Python's `typing.Protocol`, pro...	03-15 06:36	Success	-	View
exp_self.20260315063318.006_20260315_063400 Paper: self.20260315063318.006	Benchmark: SSM Strategy Stress Test This repository contains a lightweight benchmark designed to evaluate the efficiency of Selective State Space Models (SSM) versus standard Transformer architectures under strict memory constraints (8GB VRAM). Hypothesis Applying SSM archite...	03-15 06:34	Success	-	View
exp_pytrain.20260315062924.005_20260315_062957 Paper: pytrain.20260315062924.005	Dynamic Plugin Loader with Protocol Validation Objective Evaluate the performance and correctness of a dynamic plugin architecture built on Python's `importlib` and structural subtyping via `typing.Protocol`. Hypothesis Using `typing.Protocol` allows for a robust plugin system w...	03-15 06:30	Success	-	View
exp_self.20260315062651.005_20260315_062723 Paper: self.20260315062651.005	SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy (chunked processing) improves throughput and manages VRAM more effectively under strict 8GB constraints compared to a...	03-15 06:27	Success	-	View
exp_pytrain.20260315061300.004_20260315_061347 Paper: pytrain.20260315061300.004	Strictly Typed Modular Data Pipeline Benchmark This document outlines the specifications for a self-validating Python coding drill focused on creating a strictly typed, modular data pipeline. Objective The goal is to implement a `pipeline.py` style module contained within `benchmark.py`...	03-15 06:23	Success	-	View
exp_self.20260315060926.004_20260315_061009 Paper: self.20260315060926.004	Self-directed benchmark: SSM strategy stress test This repository contains a minimal benchmark designed to test the hypothesis that applying State Space Model (SSM) strategies with disciplined memory policies improves throughput under constrained VRAM environments (specifically 8GB). Conte...	03-15 06:10	Success	-	View
exp_self.20260315060553.003_20260315_060626 Paper: self.20260315060553.003	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315060553.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 06:06	Success	-	View
exp_pytrain.20260315060216.003_20260315_060246 Paper: pytrain.20260315060216.003	Python Skill Fallback Title: Robust Async Plugin Loader with Structural Subtyping - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-15 06:02	Success	-	View
exp_self.20260315055921.002_20260315_060009 Paper: self.20260315055921.002	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260315055921.002 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-15 06:00	Success	-	View
exp_pytrain.20260315055518.002_20260315_055617 Paper: pytrain.20260315055518.002	PEP 695 Generic Result Monad Implementation Overview This benchmark implements a generic `Result[T, E]` Monad (a container for success or failure states) utilizing Python 3.12's Type Parameter Syntax (PEP 695). This syntax removes the boilerplate of importing `Generic` and `TypeVar`...	03-15 05:56	Success	-	View
exp_self.20260315055209.001_20260315_055258 Paper: self.20260315055209.001	SSM Strategy Stress Test Benchmark This repository contains a minimal benchmark designed to test the hypothesis that a State Space Model (SSM) utilizing a disciplined memory policy (specifically, chunked computation) achieves higher throughput and lower VRAM usage compared t...	03-15 05:53	Success	-	View
exp_pytrain.20260315054745.001_20260315_054818 Paper: pytrain.20260315054745.001	Type-Safe Plugin Loader Benchmark This benchmark verifies the implementation of a dynamic plugin loader that enforces structural subtyping (Protocols) at runtime. Objective Implement `ExtensionLoader.load(spec, protocol)` which: 1. Parses a string specification `module:attr...	03-15 05:48	Success	-	View
exp_self.20260314211910.042_20260314_211934 Paper: self.20260314211910.042	Self-directed benchmark: SSM Strategy Stress Test Hypothesis Applying SSM (State Space Model) architectures with a disciplined memory policy (specifically gradient checkpointing and selective state retention) improves throughput under 8GB VRAM constraints compared to standard eager executi...	03-14 21:19	Pending	-	View
exp_self.20260314211641.041_20260314_211703 Paper: self.20260314211641.041	README: SSM Strategy Stress Test Objective This benchmark validates the hypothesis that applying State Space Model (SSM) inference strategies with disciplined memory management significantly improves throughput (tokens/sec) while maintaining lower VRAM footprints compared...	03-14 21:17	Success	-	View
exp_pytrain.20260314211425.022_20260314_211445 Paper: pytrain.20260314211425.022	PEP 695 Generic CLI Manager This benchmark tests the ability to write modern, type-safe Python code utilizing PEP 695 (Type Parameter Syntax) introduced in Python 3.12. It combines this new syntax with standard library packaging conventions to create a robust CLI...	03-14 21:14	Success	-	View
exp_self.20260314210834.040_20260314_210859 Paper: self.20260314210834.040	SSM Strategy Stress Test Overview This benchmark evaluates the Hypothesis: applying SSM (State Space Model) with a disciplined memory policy improves throughput under 8GB VRAM constraints compared to standard attention-based caching mechanisms. Concept We compa...	03-14 21:13	Success	-	View
exp_pytrain.20260314210615.021_20260314_210641 Paper: pytrain.20260314210615.021	Type-Safe Dynamic Plugin Loader This benchmark tests the ability to construct a mock Python package structure in memory, dynamically discover and load a plugin using `importlib`, and enforce strict adherence to `typing.Protocol` interfaces at runtime. Instructions 1. Ensu...	03-14 21:06	Success	-	View
exp_self.20260314210436.039_20260314_210455 Paper: self.20260314210436.039	Self-directed benchmark: SSM strategy stress test This repository contains a benchmark designed to test the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy and dynamic precision improves throughput under constrained VRAM environments (8GB). Methodology T...	03-14 21:05	Success	-	View
exp_self.20260314210210.038_20260314_210243 Paper: self.20260314210210.038	SSM Strategy Stress Test Benchmark This benchmark evaluates the performance of a State Space Model (SSM) inference implementation under strict memory constraints (8GB). It compares a Baseline implementation (naive memory management, standard precision) against an **Optim...	03-14 21:02	Success	-	View
exp_pytrain.20260314205947.020_20260314_210012 Paper: pytrain.20260314205947.020	Python Skill Fallback Title: Strictly-Typed Artifact Persistence System - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 21:00	Success	-	View
exp_self.20260314205801.037_20260314_205827 Paper: self.20260314205801.037	SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with a disciplined memory policy improves throughput under constrained VRAM (8GB limit). Objective Compare a standard Transformer architecture (Baselin...	03-14 20:58	Success	-	View
exp_self.20260314205536.036_20260314_205609 Paper: self.20260314205536.036	Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the memory efficiency and throughput of a State Space Model (SSM) strategy compared to a standard Transformer baseline under constrained memory conditions (8GB VRAM). Hypothesis Applying SSMs with a discipl...	03-14 20:56	Success	-	View
exp_pytrain.20260314205334.019_20260314_205352 Paper: pytrain.20260314205334.019	Python Skill Fallback Title: Generic Plugin Registry and Module Encapsulation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 20:53	Success	-	View
exp_self.20260314204343.035_20260314_204405 Paper: self.20260314204343.035	SSM Strategy Stress Test: Memory & Throughput This benchmark evaluates the hypothesis that a State Space Model (SSM) simulation, operating with a disciplined memory policy, provides superior throughput and lower VRAM footprint compared to a standard Transformer attention baseline under...	03-14 20:52	Success	-	View
exp_self.20260314204048.034_20260314_204112 Paper: self.20260314204048.034	SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that a State Space Model (SSM) approach with disciplined memory management yields superior throughput and lower VRAM usage compared to standard Attention-based mechanisms under constrained memory (8GB...	03-14 20:41	Success	-	View
exp_pytrain.20260314203818.018_20260314_203839 Paper: pytrain.20260314203818.018	Dynamic Plugin Loader with Protocol Validation Overview This coding drill benchmarks your ability to construct a flexible, robust plugin architecture using Python's standard library. The task involves dynamic module discovery using `importlib` and structural interface enforcement using...	03-14 20:38	Success	-	View
exp_self.20260314203558.033_20260314_203647 Paper: self.20260314203558.033	Self-directed benchmark: ssm strategy stress test Overview This benchmark evaluates the effectiveness of memory optimization strategies for State Space Models (SSMs) under constrained VRAM conditions (8GB). It compares a baseline SSM implementation with memory policy optimizations against...	03-14 20:36	Success	-	View
exp_self.20260314203326.032_20260314_203403 Paper: self.20260314203326.032	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260314203326.032 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-14 20:34	Success	-	View
exp_pytrain.20260314203039.017_20260314_203121 Paper: pytrain.20260314203039.017	Python Skill Fallback Title: Strictly Typed Data Pipeline with Packaging Standards - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 20:31	Success	-	View
exp_self.20260314202755.031_20260314_202816 Paper: self.20260314202755.031	Self-Directed Benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that State Space Models (SSMs) with a disciplined memory policy provide superior throughput and lower VRAM usage compared to standard Transformer attention mechanisms under strict memory constrain...	03-14 20:28	Success	-	View
exp_self.20260314202539.030_20260314_202606 Paper: self.20260314202539.030	SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying a State Space Model (SSM) strategy with a disciplined memory policy (specifically, chunked inference and dynamic precision) significantly improves throughput and reduces VRAM pressur...	03-14 20:26	Success	-	View
exp_pytrain.20260314202319.016_20260314_202340 Paper: pytrain.20260314202319.016	Python Skill Fallback Title: Dynamic Module Loader with Runtime Type Verification - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 20:23	Success	-	View
exp_self.20260314202053.029_20260314_202112 Paper: self.20260314202053.029	SSM Strategy Stress Test This benchmark evaluates a synthetic State Space Model (SSM) inference strategy against a standard Transformer-style KV-Cache approach. Hypothesis Applying an SSM-inspired disciplined memory policy (fixed state size + dynamic precision) imp...	03-14 20:21	Success	-	View
exp_self.20260314201752.028_20260314_201818 Paper: self.20260314201752.028	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying State Space Models (SSM) with a disciplined memory policy improves inference throughput under strict 8GB VRAM constraints compared to standard attention-based accumulation. Context Tradi...	03-14 20:18	Success	-	View
exp_pytrain.20260314201530.015_20260314_201553 Paper: pytrain.20260314201530.015	Generic Package Resource Loader using PEP 695 This benchmark tests the implementation of a type-safe generic resource loader using Python 3.12's PEP 695 Type Parameter Syntax. It verifies the ability to define a generic class `ResourceDecoder[T]` and utilize `importlib.resources` to re...	03-14 20:15	Success	-	View
exp_self.20260314201313.027_20260314_201336 Paper: self.20260314201313.027	This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with a disciplined memory polic... Overview The benchmark compares a standard Transformer-based architecture (Baseline) against a linear-complexity SSM-inspired architecture (Innovation). * Baseline (Attention): Utilizes standard `nn.MultiheadAttention`. This mechanism s...	03-14 20:14	Success	-	View
exp_self.20260314201040.026_20260314_201101 Paper: self.20260314201040.026	SSM Strategy Stress Test This benchmark evaluates the performance characteristics of a State Space Model (SSM) implementation against a standard Transformer Attention baseline. The goal is to verify the hypothesis that an SSM architecture, when combined with a disc...	03-14 20:11	Success	-	View
exp_pytrain.20260314200825.014_20260314_200847 Paper: pytrain.20260314200825.014	Benchmark: Robust Package Structure Validator Objective This benchmark tests the ability to write a robust, type-safe Python tool using only the standard library (`typing`, `pathlib`, `contextlib`). The task is to simulate a Python package generation process and implement a validation...	03-14 20:08	Success	-	View
exp_self.20260314200628.025_20260314_200701 Paper: self.20260314200628.025	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260314200628.025 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-14 20:07	Success	-	View
exp_self.20260314200342.024_20260314_200403 Paper: self.20260314200342.024	Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark is designed to test the hypothesis that applying a State Space Model (SSM) with a disciplined memory policy improves throughput under 8GB VRAM constraints. It implements a synthetic Diagonal State Space Model (DSSM)...	03-14 20:04	Success	-	View
exp_pytrain.20260314200107.013_20260314_200130 Paper: pytrain.20260314200107.013	Strictly Typed Plugin Architecture with Dynamic Registry This benchmark evaluates the design and implementation of a strictly typed plugin system using Python's `typing.Protocol`. The script simulates a computational engine package structure, defining a `Kernel` interface and enforcing strict typ...	03-14 20:01	Success	-	View
exp_self.20260314195908.023_20260314_195939 Paper: self.20260314195908.023	Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that applying Selective State Space Models (SSM) with a disciplined memory policy (dynamic precision) improves throughput under 8GB VRAM constraints compared to standard attention mechanisms. Experime...	03-14 19:59	Success	-	View
exp_self.20260314195623.022_20260314_195647 Paper: self.20260314195623.022	This benchmark compares a standard Attention-based Transformer block against a simulated State Space Model (SSM) archite... 1. README.md SSM Strategy Stress Test Objective To verify the hypothesis that applying SSM (State Space Model) architectures with a disciplined memory policy significantly improves throughput (tokens/sec) and reduces VRAM usage compared to...	03-14 19:56	Success	-	View
exp_pytrain.20260314195414.012_20260314_195434 Paper: pytrain.20260314195414.012	Python Skill Fallback Title: Strictly Typed Auto-Registry System - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 19:54	Success	-	View
exp_self.20260314195229.021_20260314_195251 Paper: self.20260314195229.021	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260314195229.021 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-14 19:52	Success	-	View
exp_self.20260314194939.020_20260314_195012 Paper: self.20260314194939.020	Entropy-Based State Stagnation This benchmark tests the hypothesis that during fluent text generation (characterized by low entropy/uncertainty in the next-token prediction), the internal state of a State Space Model (SSM) remains relatively constant. By monitoring the e...	03-14 19:50	Success	-	View
exp_pytrain.20260314194757.011_20260314_194816 Paper: pytrain.20260314194757.011	Runtime-Verified Plugin Loader via Protocols This benchmark demonstrates a robust mechanism for loading and verifying Python plugins dynamically using Structural Subtyping (Protocols) rather than traditional Inheritance (ABCs). Context In plugin architectures, developers often nee...	03-14 19:48	Success	-	View
exp_self.20260314194557.019_20260314_194637 Paper: self.20260314194557.019	Frequency-Modulated State Layers (FMSL) Paper ID: self.20260314194557.019 - Hypothesis: Semantic processing happens early. We can use FP16 for state updates in the first 50% of layers and FP8 for the last 50%. This 'frequency modulation' of precision saves VRAM bandwidth during t...	03-14 19:46	Success	-	View
exp_self.20260314194408.018_20260314_194434 Paper: self.20260314194408.018	Adaptive SSM-Attention Router Benchmark This benchmark validates the "Adaptive SSM-Attention Router" hypothesis: that a learned router can identify "hard" tokens requiring global attention and route "easy" tokens to a linear SSM path, resulting in sub-linear KV Cache memory scali...	03-14 19:44	Success	-	View
exp_self.20260314194213.017_20260314_194240 Paper: self.20260314194213.017	Tiered Precision State Cache (TPSC) Paper ID: self.20260314194213.017 - Hypothesis: State-Space models rely on a recurrent hidden state. The influence of distant tokens on the current gradient is mathematically bounded. By tiering the cache (FP32 for active, FP16 for history)...	03-14 19:42	Success	-	View
exp_pytrain.20260314194019.010_20260314_194046 Paper: pytrain.20260314194019.010	Dynamic Async Plugin Loader with Type Safety This benchmark tests proficiency in Python's `asyncio`, `typing` protocols, and dynamic module loading. The system constructs a temporary plugin package structure on disk, writes an asynchronous class implementation, and loads it using stan...	03-14 19:40	Success	-	View
exp_self.20260314193812.016_20260314_193846 Paper: self.20260314193812.016	DEDP Benchmark: Dynamic Precision for SSMs This repository contains a minimal, runnable benchmark for Delta-Encoded Dynamic Precision (DEDP). Hypothesis Small changes in the recurrent state (low delta) can be safely stored in INT8, while large changes (high delta) require FP16 t...	03-14 19:38	Success	-	View
exp_self.20260314193546.015_20260314_193625 Paper: self.20260314193546.015	Temporal Decay Quantization (TDQ) Paper ID: self.20260314193546.015 - Hypothesis: Older history in the recurrent state is less critical for immediate next-token prediction than recent history. We can quantize the 'tail' of the state history to 4-bit or 8-bit while keeping t...	03-14 19:36	Success	-	View
exp_pytrain.20260314193340.009_20260314_193406 Paper: pytrain.20260314193340.009	Strictly Typed Modular Entry Point This coding drill validates your ability to design a strictly typed, modular Python application structure within a single script. It simulates package distribution metadata (`__version__`, `__all__`), defines a `Protocol` for interface enfo...	03-14 19:34	Success	-	View
exp_self.20260314193119.014_20260314_193159 Paper: self.20260314193119.014	CPU-Offloaded State Streaming with Prefetch Paper ID: self.20260314193119.014 - Hypothesis: Existing CPU offloading is sync/blocking. By creating a 'background thread' that predicts the next required state window and prefetches it to GPU VRAM before the SSM scan reaches it, we can...	03-14 19:32	Success	-	View
exp_self.20260314192907.013_20260314_192935 Paper: self.20260314192907.013	Contextual LoRA Switching via State Clustering This benchmark tests the hypothesis that an SSM's internal state can serve as a highly efficient signal for routing specialized domain experts (LoRA adapters). The Innovation Traditional LLMs use static weights or computationally expensive...	03-14 19:29	Success	-	View
exp_pytrain.20260314192717.008_20260314_192740 Paper: pytrain.20260314192717.008	Typed Dependency Graph Resolver Overview This benchmark evaluates the implementation of a robust package dependency resolver using modern Python static typing features (`Protocol`, `Generics`, `dataclasses`) and standard library packaging tools (`tomllib`). Objective The...	03-14 19:27	Success	-	View
exp_self.20260314192515.012_20260314_192540 Paper: self.20260314192515.012	Task-Gated Semantic State Pruning Paper ID: self.20260314192515.012 - Hypothesis: Not all history is useful for the next token prediction. By using a lightweight 'Gate' (similar to a gating mechanism in LSTMs but applied to the state dimension) driven by the current embeddi...	03-14 19:25	Success	-	View
exp_self.20260314192234.011_20260314_192334 Paper: self.20260314192234.011	Time-Aware Tiered Precision (TATP) for SSM States Paper ID: self.20260314192234.011 - Hypothesis: Recent history in an SSM is more sensitive to precision than ancient history. By storing t-1 states in FP16, t-10 in INT8, and t-50 in INT4, we can fit longer contexts on 8GB GPUs. - Plan: Mod...	03-14 19:23	Success	-	View
exp_pytrain.20260314192035.007_20260314_192101 Paper: pytrain.20260314192035.007	Strictly-Typed Model Registry and Configuration Loader Overview This benchmark demonstrates a robust, type-safe implementation of a Model Registry and Configuration Loader, inspired by the architecture of modern LLM frameworks like PyTorch and LitGPT. The Hypothesis Explicitly defining interfac...	03-14 19:21	Success	-	View
exp_self.20260314191817.010_20260314_191849 Paper: self.20260314191817.010	Entropy-Based Dynamic State Quantization README.md This benchmark explores Entropy-Based Dynamic State Quantization for State Space Models (SSMs). Hypothesis We hypothesize that the "cognitive load" of an SSM, measured by the entropy of its hidden state $h_t$, fluctuates durin...	03-14 19:18	Success	-	View
exp_self.20260314191621.009_20260314_191644 Paper: self.20260314191621.009	Variance-Gated Dynamic State Precision Benchmark Overview This benchmark tests the Variance-Gated Dynamic State Precision hypothesis. It posits that not all states in a State Space Model (SSM) require high precision (FP16). By monitoring the variance of the hidden state during inferen...	03-14 19:16	Success	-	View
exp_pytrain.20260314191445.006_20260314_191501 Paper: pytrain.20260314191445.006	Robust Dynamic Plugin Loader Benchmark Objective This benchmark evaluates the ability of an autonomous system to design a secure, extensible architecture using Python's standard library. Specifically, it tests the dynamic loading of Python modules (plugins) from a temporary file...	03-14 19:15	Success	-	View
exp_self.20260314191213.008_20260314_191238 Paper: self.20260314191213.008	Tiered-Precision SSM State Cache Paper ID: self.20260314191213.008 - Hypothesis: A tiered precision scheme (Hot=FP16, Cold=INT4) will double the effective context window of an SSM with negligible perplexity increase. - Plan: Implement a ring-buffer for the SSM state. Quant...	03-14 19:12	Success	-	View
exp_self.20260314191014.007_20260314_191041 Paper: self.20260314191014.007	Latent State Injection for RAG Overview This benchmark evaluates Latent State Injection, a novel approach to Retrieval-Augmented Generation (RAG) using State Space Models (SSMs). The Innovation Standard RAG systems retrieve raw text chunks, concatenate them with the...	03-14 19:10	Success	-	View
exp_pytrain.20260314190817.005_20260314_190842 Paper: pytrain.20260314190817.005	Python Skill Fallback Title: Dynamic Package Construction and Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 19:08	Success	-	View
exp_self.20260314190614.006_20260314_190647 Paper: self.20260314190614.006	Tiered-Precision State Cache for Mamba Overview This benchmark evaluates a Tiered-Precision State Cache designed for State Space Models (SSMs) like Mamba. The Problem: Long-context SSMs must maintain a massive hidden state (`h_t`) that grows or updates with every token....	03-14 19:06	Success	-	View
exp_gh_huggingface_transformers_20260314_190423 Paper: gh_huggingface_transformers	Hugging Face Transformers Efficiency Benchmark This benchmark evaluates the performance of the `transformers` library, focusing on efficient inference strategies for Large Language Models (LLMs). It highlights the library's optimization capabilities, specifically KV-Caching for gene...	03-14 19:04	Success	-	View
exp_pytrain.20260314190202.004_20260314_190224 Paper: pytrain.20260314190202.004	Type-Safe Virtual Package Manager Benchmark This benchmark tests the ability to write a robust, type-safe CLI application using Python's standard library. The candidate must implement a virtual package manager that handles dependencies, immutability, and argument parsing according to...	03-14 19:02	Success	-	View
exp_self.20260314185958.005_20260314_190030 Paper: self.20260314185958.005	Speculative RAG Skipping Paper ID: self.20260314185958.005 - Hypothesis: If the SSM state has low entropy (high confidence) regarding the next token, the answer is likely 'in memory'. If entropy spikes, we trigger RAG. This creates a 'Just-In-Time' retrieval system...	03-14 19:00	Success	-	View
exp_self.20260314185715.004_20260314_185800 Paper: self.20260314185715.004	Sparse Attention Routing for SSM Recall This benchmark evaluates a hybrid architecture designed to solve the "Needle-in-a-Haystack" retrieval problem often faced by State Space Models (SSMs) like Mamba. Hypothesis While SSMs excel at efficient reasoning over long sequences (low e...	03-14 18:58	Success	-	View
exp_pytrain.20260314185436.003_20260314_185501 Paper: pytrain.20260314185436.003	Robust Dynamic Plugin Loader with Protocol Enforcement This coding benchmark tests the ability to construct a robust, type-safe plugin architecture using Python's standard library. It focuses on combining `typing.Protocol` for interface definition and `importlib` for runtime module loading to c...	03-14 18:55	Success	-	View
exp_self.20260314185233.003_20260314_185308 Paper: self.20260314185233.003	Tiered SSM State Cache Benchmark Innovation This benchmark tests a Tiered SSM State Cache mechanism. Hypothesis: Offloading older SSM states to system RAM (at FP16) while keeping active states in GPU VRAM (at FP8) will allow for effectively infinite context windows...	03-14 18:53	Success	-	View
exp_self.20260314184953.002_20260314_185025 Paper: self.20260314184953.002	Delta-State Compression for Long Context This benchmark implements a simulation of State Space Model (SSM) state caching to verify the Delta-State Compression hypothesis. Hypothesis SSM states evolve smoothly over time (governed by decay factors like $A \bar{H}$). Therefore, s...	03-14 18:50	Success	-	View
exp_pytrain.20260314184735.002_20260314_184807 Paper: pytrain.20260314184735.002	PEP 695 Generic Repository Implementation Benchmark Overview This coding drill verifies your ability to utilize PEP 695 Type Parameter Syntax introduced in Python 3.12. The Challenge You must implement a generic in-memory Repository within `benchmark.py`. The implementation is strictly c...	03-14 18:48	Success	-	View
exp_self.20260314184516.001_20260314_184550 Paper: self.20260314184516.001	CPU-Offloaded Tiered State Cache Paper ID: self.20260314184516.001 - Hypothesis: Distant states in an SSM have diminishing impact on the immediate next token. Quantizing and moving them to system RAM frees up GPU VRAM, allowing for significantly longer context windows with...	03-14 18:45	Success	-	View
exp_2603.12254v1_20260314_184330 Paper: 2603.12254v1	This benchmark implements a synthetic simulation of the AutoGaze architecture to compare a standard ViT (Baseline) again... AutoGaze Efficiency Benchmark This repository contains a synthetic benchmark designed to evaluate the efficiency claims of AutoGaze (Attend Before Attention). It simulates the heavy computational load of processing long, high-resolution...	03-14 18:43	Success	-	View
exp_pytrain.20260314184102.001_20260314_184126 Paper: pytrain.20260314184102.001	Type-Safe Local Package Validator A Python coding drill benchmark designed to test your ability to create robust, type-safe package management tools. Objective Create a CLI script `validate_and_install.py` (simulated within `benchmark.py`) that verifies a local library's ty...	03-14 18:41	Success	-	View
exp_self.20260314183733.004_20260314_183757 Paper: self.20260314183733.004	Tiered SSM State Cache Benchmark This benchmark tests the hypothesis that offloading older SSM (State Space Model) states to system RAM while keeping active states in GPU VRAM allows for effectively infinite context windows on consumer hardware. Benchmark Details The code...	03-14 18:38	Success	-	View
exp_pytrain.20260314183557.018_20260314_183618 Paper: pytrain.20260314183557.018	Dynamic Package Construction and Type-Safety Verification This benchmark tests an autonomous system's ability to programmatically scaffold a Python project structure, generate strictly typed source code, and perform runtime verification against a defined `Protocol`. Objectives 1. **Filesystem Oper...	03-14 18:36	Success	-	View
exp_self.20260314183316.003_20260314_183346 Paper: self.20260314183316.003	SSM State Recycling Benchmark This benchmark tests the hypothesis that maintaining the SSM (State Space Model) hidden state across tool execution boundaries improves efficiency (tokens/sec) and reduces context re-processing overhead. The Innovation: Standard LLM wor...	03-14 18:33	Success	-	View
exp_self.20260314183014.002_20260314_183104 Paper: self.20260314183014.002	Dynamic Precision State Skipping Benchmark This benchmark evaluates the "Dynamic Precision State Skipping" hypothesis for Mamba-style State Space Models (SSMs). The core idea is that during fluent generation (low entropy), the state changes slowly, allowing for lower precision (INT4...	03-14 18:31	Success	-	View
exp_pytrain.20260314182827.017_20260314_182856 Paper: pytrain.20260314182827.017	Python Skill Fallback Title: Strictly Typed Modular Data Processor - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 18:28	Success	-	View
exp_gh_Dao-AILab_flash-attention_20260314_182610 Paper: gh_Dao-AILab_flash-attention	Flash Attention Benchmark This benchmark evaluates the performance and memory efficiency of Flash Attention compared to standard attention mechanisms in transformer models. What is Flash Attention? Flash Attention is a fast and memory-efficient exact attention algor...	03-14 18:26	Success	-	View
exp_hf_2603.08258_20260314_182342 Paper: hf_2603.08258	WaDi: Weight Direction-aware Distillation for One-step Image Synthesis Paper ID: hf_2603.08258 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	03-14 18:23	Success	-	View
exp_pytrain.20260314182119.016_20260314_182137 Paper: pytrain.20260314182119.016	Runtime Type-Checked Dynamic Plugin Loader This benchmark evaluates the capability of a Python system to simulate a packaging environment by programmatically generating a Python module, persisting it to disk, and dynamically importing it using `importlib` machinery. The core challen...	03-14 18:21	Success	-	View
exp_gh_vllm-project_vllm_20260314_180402 Paper: gh_vllm-project_vllm	vLLM Inference Benchmark This benchmark evaluates the performance of vLLM, a high-throughput and memory-efficient inference engine for Large Language Models (LLMs). vLLM introduces PagedAttention, an algorithm that optimizes memory management for the KV cac...	03-14 18:19	Success	-	View
exp_pytrain.20260314180137.015_20260314_180205 Paper: pytrain.20260314180137.015	Dynamic Type-Safe Package Generator Benchmark Overview This benchmark evaluates a system's ability to programmatically construct a valid Python package structure on the filesystem, populate it with source code adhering to modern typing standards (specifically PEP 695 Type Parameter Syn...	03-14 18:02	Success	-	View
exp_self.20260314175900.001_20260314_175926 Paper: self.20260314175900.001	Adaptive Tool-State Quantization (ATSQ) Benchmark This repository contains a runnable benchmark for the "Adaptive Tool-State Quantization" innovation. It tests the hypothesis that selectively applying 4-bit quantization to a State Space Model's (SSM) hidden state only during tool-use tra...	03-14 17:59	Success	-	View
exp_hf_2603.10604_20260314_175718 Paper: hf_2603.10604	HyPER-GAN Benchmark This benchmark evaluates the real-time inference capabilities of the HyPER-GAN architecture simulation, focusing on memory efficiency and patch throughput. Key Metrics * VRAM_USAGE: Peak GPU memory consumed during the patch-enhancement...	03-14 17:57	Success	-	View
exp_pytrain.20260314175435.014_20260314_175521 Paper: pytrain.20260314175435.014	Strictly-Typed Plugin Loader with Entry Point Simulation Overview This benchmark tests a developer's ability to implement a robust plugin architecture using Python's standard library. Specifically, it evaluates the use of `typing.Protocol` for defining structural interfaces (`SupportsProcess`) an...	03-14 17:55	Success	-	View
exp_2308.04657v1_20260314_175319 Paper: 2308.04657v1	Benchmarking Token Reduction in Vision Transformers (ViTs) Architecture: Investigates token reduction in Vision Transformers (ViTs) across 10 methods, contrasting dynamic pruning against fixed spatial patterns. Memory Footprint: Token pruning reduces sequence length within self-attention la...	03-14 17:53	Success	-	View
exp_2308.01045v2_20260314_175232 Paper: 2308.01045v2	Benchmark for Dynamic Token Pruning (DToP) in Vision Transformers Architecture: Introduces Dynamic Token Pruning (DToP) for plain Vision Transformers (ViTs). It employs a multi-stage architecture with auxiliary classifiers to grade token difficulty. Instead of dropping tokens (which harms dense output...	03-14 17:52	Success	-	View
exp_2409.08464v2_20260314_175146 Paper: 2409.08464v2	This benchmark evaluates the VLTP (Vision Language Guided Token Pruning) framework, specifically investigating the h... Architecture: VLTP inserts a trainable "pruning decoder" into the ViT pipeline. This module fuses image tokens with Vision-Language guidance (from an MLLM) to predict token relevance. Only tokens identified as pertinent to the specific...	03-14 17:51	Success	-	View
exp_2512.14332v1_20260314_175050 Paper: 2512.14332v1	Step-Tagging Framework Benchmark Architecture: The paper proposes "Step-Tagging," a framework utilizing a lightweight, auxiliary sentence-classifier alongside the host Language Reasoning Model (LRM). It introduces "ReasonType," a specific taxonomy for categorizing reas...	03-14 17:50	Success	-	View
exp_2504.01690v2_20260314_175010 Paper: 2504.01690v2	Backfill Candidate 2504.01690v2 Architecture: Adapts TopK token pruning to ViT-based audio encoders (AudioMAE, AST) processing Mel-spectrograms. Memory & Speed: Achieves a 30-40% reduction in Multiply-Accumulate (MAC) operations with <1% accuracy drop. Reducing to...	03-14 17:50	Success	-	View
exp_pytrain.20260314174817.013_20260314_174837 Paper: pytrain.20260314174817.013	Strictly-Typed Modular Configuration System Overview This benchmark challenges the developer to construct a robust, modular configuration loader and inference engine simulator using Python's advanced type-hinting capabilities. The goal is to enforce strict interface contracts using `...	03-14 17:48	Success	-	View
exp_2505.21375v2_20260314_174704 Paper: 2505.21375v2	Backfill Candidate 2505.21375v2 Architecture: Built on the LLaVA framework, specifically modified for remote sensing (RS). It introduces Background Token Pruning and Anchored Token Selection to address the "token explosion" typical in ultra-high-res inputs. Th...	03-14 17:47	Success	-	View
exp_2302.06015v3_20260314_174626 Paper: 2302.06015v3	Benchmark: Token Sparsification in Shallow ViTs Summary for ARES 8GB Roadmap Architecture: The paper provides a theoretical framework for a shallow ViT architecture, specifically a single self-attention layer followed by a 2-layer MLP. Memory Footprint & Inference Speed:...	03-14 17:46	Success	-	View
exp_2506.07138v1_20260314_174543 Paper: 2506.07138v1	Spatial Token Fusion (STF) Benchmark Architecture: Proposes Spatial Token Fusion (STF) to merge adjacent spatial tokens, drastically shortening the visual sequence. It is augmented by Multi-Block Token Fusion (MBTF), which injects multi-granularity features to pres...	03-14 17:45	Success	-	View
exp_2307.13770v1_20260314_174457 Paper: 2307.13770v1	Backfill Candidate 2307.13770v1 Architecture E^2VPT implements a dual-prompt strategy to freeze backbone weights. It introduces learnable visual tokens at the input layer and injects learnable Key-Value (KV) pairs directly into the self-attention mechanisms of transfo...	03-14 17:45	Success	-	View
exp_2307.10780v2_20260314_174404 Paper: 2307.10780v2	Benchmark: Learned Threshold Masking Pruning (LTMP) on ViT Architecture: LTMP integrates learned threshold masking modules into Vision Transformers (ViTs). These modules dynamically route tokens—deciding between merging (similarity-based grouping) or pruning (dropping)—to optimize sequence leng...	03-14 17:44	Success	-	View
exp_pytrain.20260314174209.012_20260314_174232 Paper: pytrain.20260314174209.012	Python Skill Fallback Title: Robust Typed Dependency Container - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 17:42	Success	-	View
exp_2402.02554v2_20260314_174057 Paper: 2402.02554v2	DeSparsify: Adversarial DoS Benchmark for Vision Transformers Paper: DeSparsify: Adversarial Attack Against Token Sparsification Mechanisms in Vision Transformers Summary for ARES 8GB Roadmap: * Architecture: Targets Vision Transformers (ViTs) utilizing dynamic token sparsification mechani...	03-14 17:41	Success	-	View
exp_2409.10197v2_20260314_174005 Paper: 2409.10197v2	Benchmark: FitPrune - Training-Free Visual Token Pruning for MLLMs Architecture: FitPrune is a training-free, statistical pruning method for MLLMs (e.g., LLaVA). Instead of dynamic evaluation, it generates a static "pruning recipe" by analyzing attention map distributions on a small calibration batch....	03-14 17:40	Success	-	View
exp_2505.15816v1_20260314_173916 Paper: 2505.15816v1	Benchmark: ProxyV Vision Token Bypass Architecture ProxyV introduces lightweight "proxy vision tokens" into the LLM backbone. While original vision tokens are preserved to prevent information loss, the proxy tokens handle the heavy lifting (Self-Attention and FFNs). Origina...	03-14 17:39	Success	-	View
exp_2510.07974v2_20260314_173826 Paper: 2510.07974v2	Adaptive World Model Benchmark Architecture: Proposes a wrapper mechanism ("Adaptive World Model") that constructs a dynamic textual world model to track entity states and timelines. It monitors the LLM’s reasoning trajectory for specific "confusion indicators" (e.g....	03-14 17:38	Success	-	View
exp_hf_2603.06854_20260314_173739 Paper: hf_2603.06854	Backfill Candidate hf_2603.06854 Architecture Proposes an inference-time activation steering mechanism to mitigate "text dominance" in Large Audio-Language Models (LALMs). It utilizes mechanistic interpretability to identify specific "audio-specialist" attention heads...	03-14 17:37	Success	-	View
exp_pytrain.20260314173532.011_20260314_173558 Paper: pytrain.20260314173532.011	AST-Driven Type-Aware ZipApp Builder Overview This benchmark tests an autonomous coding system's ability to leverage Python's `ast` module for static analysis and the `zipfile` module for packaging. The task is to implement a `StrictZipAppBuilder` class that enforces a "strict...	03-14 17:36	Success	-	View
exp_2511.20683v1_20260314_172837 Paper: 2511.20683v1	README: Dynamic Template Selection (DTS) Router Benchmark Architecture: Proposes a lightweight MLP router for Dynamic Template Selection (DTS) to classify query complexity and map inputs to optimized response templates. This contrasts with a heavier fine-tuned RoBERTa baseline. **Memory Fo...	03-14 17:35	Success	-	View
exp_2307.02321v2_20260314_172749 Paper: 2307.02321v2	Backfill Candidate 2307.02321v2 Architecture: MSViT proposes a dynamic mixed-scale tokenization scheme using a lightweight, conditional gating mechanism. This module selects optimal token scales per image region, functioning as a preprocessing layer that is agnostic t...	03-14 17:27	Success	-	View
exp_2403.14047v2_20260314_172650 Paper: 2403.14047v2	Backfill Candidate 2403.14047v2 Architecture: Proposes a hybrid pruning approach combining static structured block pruning (weights) with dynamic token pruning (input-dependent). A specialized training algorithm recovers accuracy, while the hardware design utilizes mu...	03-14 17:26	Success	-	View
exp_2408.17062v1_20260314_172555 Paper: 2408.17062v1	Benchmark: VoMix for Vision Transformers (ViT) Analysis for ARES 8GB Roadmap: VoMix * Architecture: A plug-and-play, parameter-free module inserted between ViT blocks. It uses a "Vote" mechanism (layer-wise similarity voting) to identify redundant tokens and a "Mix" operation to...	03-14 17:26	Success	-	View
exp_pytrain.20260314172353.010_20260314_172416 Paper: pytrain.20260314172353.010	Asynchronous Dependency Resolution Engine This benchmark tests your ability to build a robust, type-safe Python application using the standard library. The task is to implement a simplified package dependency resolver that utilizes `asyncio` for concurrent I/O operations and strict...	03-14 17:24	Success	-	View
exp_2407.10756v2_20260314_172237 Paper: 2407.10756v2	GTPT Token Pruning Efficiency Benchmark Architecture GTPT is a coarse-to-fine Transformer designed for efficient human pose estimation. It dynamically introduces keypoints and processes them via "Multi-Head Group Attention" (MHGA). To optimize efficiency, the architecture gro...	03-14 17:22	Success	-	View
exp_2507.08806v1_20260314_172153 Paper: 2507.08806v1	Benchmark: Structure-Aware Pruning for KV Cache Optimization Architecture: Proposes "Structure-Aware Pruning," an inference-time method that injects temporary "end-of-thinking" instructions. It analyzes attention patterns relative to these markers to identify and evict low-contributing reasoning...	03-14 17:21	Success	-	View
exp_2506.07077v1_20260314_172103 Paper: 2506.07077v1	Dual-Priv Pruning: Visual Token Optimization Benchmark Architecture: Dual-Priv Pruning targets Multimodal LLMs (MLLMs) by combining two distinct mechanisms: (1) Visual Token Pruning, which reduces input dimensionality by discarding redundant visual information, and (2) **Gradient-Update...	03-14 17:21	Success	-	View
exp_2505.22411v2_20260314_172025 Paper: 2505.22411v2	Backfill Candidate 2505.22411v2 Architecture "Manifold Steering" is an inference-time intervention, not a structural change. It identifies a low-dimensional manifold within the model's activation space responsible for redundant deliberation loops. By projecting steeri...	03-14 17:20	Success	-	View
exp_2505.19536v3_20260314_171938 Paper: 2505.19536v3	This repository contains a synthetic benchmark to evaluate the efficacy of the FlowCut optimization strategy for Large V... Architecture FlowCut is an information-flow-aware pruning framework for LVLMs. Unlike static methods relying on single-layer attention, FlowCut tracks progressive token interactions across layers using the CLS token as a relay. This dyn...	03-14 17:19	Success	-	View
exp_pytrain.20260314171739.009_20260314_171803 Paper: pytrain.20260314171739.009	Dynamic Package Inspector This benchmark evaluates an autonomous agent's ability to programmatically inspect, validate, and introspect local Python packages using only the Python Standard Library. Objective Create a robust script that defines a function `analyze_pac...	03-14 17:18	Success	-	View
exp_2505.17020v2_20260314_171633 Paper: 2505.17020v2	CrossLMM Architecture Benchmark Architecture: CrossLMM decouples long video sequences via a dual cross-attention mechanism. It first applies aggressive pooling to pretrained visual encoder outputs. Within the LLM layers, it utilizes a Visual-to-Visual cross-attention...	03-14 17:16	Success	-	View
exp_2505.12509v2_20260314_171550 Paper: 2505.12509v2	Benchmark: Proxy Framework Efficiency (Backfill 2505.12509v2) Architecture: Introduces a Proxy Framework that trains smaller, efficient models to approximate the decision boundaries of large "oracle" LLMs. It employs a "screen-and-apply" statistical mechanism to verify local alignment betw...	03-14 17:15	Success	-	View
exp_2505.10118v2_20260314_171512 Paper: 2505.10118v2	Multi-Objective Balanced Covering (MoB) Benchmark Architecture: Multi-Objective Balanced Covering (MoB). This method formulates visual token pruning as a bi-objective covering problem. It balances prompt alignment and visual preservation using Hausdorff distance bounds and $\epsilon$-c...	03-14 17:15	Success	-	View
exp_2504.10854v1_20260314_171434 Paper: 2504.10854v1	Backfill Candidate 2504.10854v1 Summary for ARES 8GB Roadmap: LVLM_CSP * Architecture: LVLM_CSP is a training-free inference accelerator designed for LVLMs performing reasoning segmentation. It utilizes a three-stage pipeline: 1. Clustering: Performs coars...	03-14 17:14	Success	-	View
exp_2504.04653v2_20260314_171358 Paper: 2504.04653v2	Backfill Candidate 2504.04653v2 Architecture: LEO-MINI introduces two core components: Conditional Token Reduction (CoTR) and a Mixture of Multi-Modal Experts (MMoE). CoTR compresses long visual sequences into compact sets using cross-attention between visual...	03-14 17:14	Success	-	View
exp_2503.23459v1_20260314_171313 Paper: 2503.23459v1	Backfill Candidate 2503.23459v1 Architecture: Proposes "RL4EViT," replacing static pruning heuristics with Multi-Agent Proximal Policy Optimization (MAPPO). Token pruning is formulated as a Markov Game where individual agents (tokens) make collaborative, layer-wise de...	03-14 17:13	Success	-	View
exp_pytrain.20260314171133.008_20260314_171149 Paper: pytrain.20260314171133.008	Strictly Typed Dependency Graph Inspector Objective Design and implement a robust, type-safe CLI utility script named `pkg_inspector.py` (simulated within the benchmark logic) that analyzes the current Python runtime environment. The solution must demonstrate proficiency with moder...	03-14 17:11	Success	-	View
exp_2511.12267v1_20260314_170022 Paper: 2511.12267v1	Backfill Candidate 2511.12267v1: Active Perception Benchmark Architecture: ZoomEarth introduces an "active perception" framework that processes Ultra-High-Resolution (UHR) images via an adaptive cropping-zooming mechanism. Instead of passively feeding the entire image into a Vision-Language Model...	03-14 17:10	Success	-	View
exp_pytrain.20260314165555.007_20260314_165748 Paper: pytrain.20260314165555.007	Type-Safe Modular Plugin System This benchmark evaluates the ability to dynamically construct a Python package structure that leverages advanced typing features (`typing.Protocol`, `typing.Generic`) to enforce interface compliance without external dependencies. Objective...	03-14 16:57	Success	-	View
exp_2511.10081v1_20260314_165148 Paper: 2511.10081v1	Benchmark for GridPrune (Backfill Candidate 2511.10081v1) Architecture GridPrune replaces standard global Top-K pruning with a two-stage "guide-globally, select-locally" strategy. It uses text-conditional guidance to dynamically allocate token quotas across spatial grids before performing loca...	03-14 16:51	Success	-	View
exp_2510.24214v1_20260314_165108 Paper: 2510.24214v1	SCOPE: Set-Coverage Oriented Visual Token Pruning Benchmark Architecture: SCOPE introduces a visual token pruning strategy for Multimodal LLMs (specifically LLaVA-1.5 and Next) designed to operate prior to the main transformer blocks. Instead of relying solely on attention-based saliency, SCOPE...	03-14 16:51	Success	-	View
exp_2510.17205v1_20260314_165015 Paper: 2510.17205v1	Backfill Candidate 2510.17205v1 Architecture & Dynamics VisiPruner leverages a discovered "three-stage" cross-modal fusion process: visual tokens act as passive attention sinks in shallow layers, drive abrupt fusion in middle layers, and are discarded in deep layers....	03-14 16:50	Success	-	View
exp_2303.08685v2_20260314_164917 Paper: 2303.08685v2	STViT Benchmark Suite Architecture: STViT replaces standard dense patch tokens with sparse "semantic tokens" acting as cluster centers. Initialized via spatial pooling and refined through attention, these tokens compress global or local information. It suppo...	03-14 16:49	Success	-	View
exp_pytrain.20260314164528.006_20260314_164707 Paper: pytrain.20260314164528.006	Type-Safe Dynamic Component Registry Overview This benchmark tests the ability to construct a robust, dependency-free component registry using Python's standard library. The design mirrors patterns found in high-performance ML frameworks like Hugging Face Diffusers and vLLM. F...	03-14 16:47	Success	-	View
exp_2403.17411v1_20260314_163254 Paper: 2403.17411v1	PCToolkit (2403.17411v1) Benchmark Architecture: PCToolkit proposes a modular, unified framework designed as a plug-and-play solution for LLMs. It integrates various cutting-edge prompt compression algorithms into a single interface, abstracting the complexity of differe...	03-14 16:42	Success	-	View
exp_2511.21477v1_20260314_163212 Paper: 2511.21477v1	Backfill Candidate 2511.21477v1 Architecture The proposed method introduces a frequency-aware token reduction module within the self-attention mechanism. It partitions tokens into high-frequency (detail-oriented) and low-frequency (structural/background) groups. High-...	03-14 16:32	Success	-	View
exp_2401.01470v2_20260314_163133 Paper: 2401.01470v2	Backfill Candidate 2401.01470v2 Architecture TPC-ViT introduces a Token Propagation Controller (TPC) module to optimize token lifecycle management. Unlike static pruning methods, TPC employs a probabilistic approach using "pause" (reduction) and "restart" (reuse) dist...	03-14 16:31	Success	-	View
exp_2511.16449v3_20260314_163041 Paper: 2511.16449v3	Benchmark for VLA-Pruner: Dual-Level Token Pruning for VLAs Architecture: VLA-Pruner is a plug-and-play module designed for Vision-Language-Action (VLA) models. It introduces a dual-level pruning strategy that deviates from standard VLM methods by considering action execution. It calculates toke...	03-14 16:30	Success	-	View
exp_2504.04024v1_20260314_163004 Paper: 2504.04024v1	WiCo (Window Concatenation) Optimization Benchmark Architecture: Utilizes a sliding window to concatenate spatially adjacent visual tokens. To prevent detail loss, the last layers of the vision encoder are fine-tuned to align features within windows. The "WiCo+" variant further decompos...	03-14 16:30	Success	-	View
exp_pytrain.20260314162814.005_20260314_162831 Paper: pytrain.20260314162814.005	Python Skill Fallback Title: Strictly-Typed Plugin System - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 16:28	Success	-	View
exp_2512.13438v1_20260314_162736 Paper: 2512.13438v1	Backfill Candidate 2512.13438v1 Architecture: UIFormer optimizes LLM agents by synthesizing UI transformation programs via a Domain-Specific Language (DSL). It utilizes constraint-based optimization and iterative LLM refinement to compress complex UI trees into semant...	03-14 16:27	Success	-	View
exp_2510.08483v1_20260314_162608 Paper: 2510.08483v1	DeepPrune Architecture Benchmark Architecture: DeepPrune introduces a specialized "Judge" model (trained via focal loss) to evaluate partial Chain-of-Thought traces. It uses an online greedy clustering algorithm to dynamically prune redundant reasoning paths before gen...	03-14 16:26	Success	-	View
exp_2505.16122v3_20260314_162339 Paper: 2505.16122v3	Plan-and-Budget (P&B) Inference Benchmark Architecture Introduces Plan-and-Budget (P&B), a model-agnostic, test-time framework that decomposes complex queries into sub-questions. A controller dynamically allocates token budgets based on estimated uncertainty, solving the "o...	03-14 16:25	Success	-	View
exp_pytrain.20260314162148.004_20260314_162209 Paper: pytrain.20260314162148.004	Python Skill Fallback Title: Type-Safe Plugin Loader with Protocol Validation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 16:22	Success	-	View
exp_2504.17996v1_20260314_162043 Paper: 2504.17996v1	Backfill Candidate 2504.17996v1 Architecture LVTP is a "plug-and-play" progressive token pruning wrapper for Vision Transformers (ViTs). It introduces a dynamic scoring mechanism that fuses multi-scale Tsallis entropy with low-level visual features (specifically edge...	03-14 16:20	Success	-	View
exp_2512.17920v1_20260314_161954 Paper: 2512.17920v1	Backfill Candidate 2512.17920v1 Paper Focus: Evaluation of LLM instruction-following robustness under prompt compression. * Architecture: No new model proposed. Evaluates 9 frontier LLMs, finding reasoning models are 27.5% more robust to compression than effic...	03-14 16:19	Success	-	View
exp_2511.20439v1_20260314_161851 Paper: 2511.20439v1	OC-VTP Benchmark Architecture: OC-VTP introduces a lightweight, plug-and-play pruner module positioned upstream of the LLM backbone. It utilizes a small, pre-trained network to select "object-centric" vision tokens by minimizing the reconstruction error...	03-14 16:19	Success	-	View
exp_2505.00019v1_20260314_161805 Paper: 2505.00019v1	Backfill Candidate 2505.00019v1 Architecture: This study evaluates six distinct prompt compression algorithms (e.g., structural pruning, token summarization) designed to preprocess inputs before feeding them to the LLM, rather than modifying the model weights themselv...	03-14 16:18	Success	-	View
exp_hf_2603.10178_20260314_161721 Paper: hf_2603.10178	Backfill Candidate hf_2603.10178 Architecture: ExeVRM is an 8B parameter Vision-Language Model (VLM) fine-tuned on the ExeVR-53k dataset to classify computer-use task success from video keyframes. Its key innovation is spatiotemporal token pruning, a mechanism that...	03-14 16:17	Success	-	View
exp_pytrain.20260314161523.003_20260314_161545 Paper: pytrain.20260314161523.003	Python Skill Fallback Title: Runtime Module Loader with Protocol Enforcement - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 16:15	Success	-	View
exp_2512.07580v2_20260314_161243 Paper: 2512.07580v2	Backfill Candidate 2512.07580v2: Information Horizon Benchmark Architecture: Identifies an "information horizon" in VLLMs where visual token salience vanishes (typically beyond layer 20). The paper proves that in deep layers, token information becomes uniform, rendering complex, attention-based pru...	03-14 16:14	Success	-	View
exp_2511.14293v1_20260314_161143 Paper: 2511.14293v1	Benchmark for Segmentwise Pruning in Audio-Language Models Architecture The paper proposes segmentwise pruning, a token selection strategy tailored for Audio-Language Models (ALMs). Unlike generic vision approaches, this method accounts for the time dimension of audio, pruning irrelevan...	03-14 16:11	Success	-	View
exp_2505.08058v2_20260314_161102 Paper: 2505.08058v2	Semantic Hypernym Compression Benchmark Architecture: Introduces a pre-processing text compression engine that utilizes word-level semantic constriction. It replaces specific nouns with their hypernyms (broader category terms) to drastically shorten sequences, relying on...	03-14 16:11	Success	-	View
exp_pytrain.20260314160913.002_20260314_160930 Paper: pytrain.20260314160913.002	Python Reliability Drill: PEP 695 Type Parameter Syntax Overview This drill validates your ability to utilize modern Python typing features introduced in PEP 695 (Type Parameter Syntax). You must implement a generic data processing pipeline that handles various data types strictly, using the new...	03-14 16:09	Success	-	View
exp_2510.11588v1_20260314_155755 Paper: 2510.11588v1	Benchmark: CAP-CPT Inference Efficiency vs. RAG Architecture Introduces CAP-CPT (Category-Aware Policy Continued Pretraining), a training pipeline that moves policy knowledge from the context window into model weights. It parses policy documents into categories (factual, behavior...	03-14 16:08	Success	-	View
exp_2409.10994v3_20260314_155716 Paper: 2409.10994v3	TRIM: Token Reduction for Efficient VLM Inference Architecture: TRIM proposes a token-pruning strategy situated between the vision encoder and the LLM. It utilizes CLIP similarity metrics to identify and retain salient visual features while discarding redundant tokens, mimicking human...	03-14 15:57	Success	-	View
exp_2511.12281v2_20260314_155636 Paper: 2511.12281v2	Backfill Candidate 2511.12281v2 Architecture: Cmprsr repurposes Qwen3-4B via Supervised Fine-Tuning (SFT) and Group Relative Policy Optimization (GRPO). It performs abstractive, token-level compression, specifically optimizing for semantic retention and strict adh...	03-14 15:56	Success	-	View
exp_pytrain.20260314155435.001_20260314_155506 Paper: pytrain.20260314155435.001	Structural Subtyping & Dynamic Plugin Loader Benchmark This project demonstrates a robust, type-safe plugin architecture using Python's advanced type hinting features and dynamic module loading. Architecture 1. Protocol Definition (`DataHandler`): We utilize `typing.Protocol` combined with...	03-14 15:55	Success	-	View
exp_2511.12281v2_20260314_152959 Paper: 2511.12281v2	Benchmark: Cmprsr (Qwen3-4B) Memory & Compression Efficiency Architecture: Cmprsr repurposes Qwen3-4B via Supervised Fine-Tuning (SFT) and Group Relative Policy Optimization (GRPO). It performs abstractive, token-level compression, specifically optimizing for semantic retention and strict adh...	03-14 15:36	Pending	-	View
exp_2504.14692v1_20260314_152911 Paper: 2504.14692v1	OmniV-Med: Scaling Medical Vision-Language Model for Universal Visual Understanding Architecture: Utilizes a unified rotary position-adaptive encoder to handle 2D, 3D, and video inputs within a single model, eliminating the architectural overhead and VRAM cost of maintaining separate modality-specific towers. **Mem...	03-14 15:29	Success	-	View
exp_pytrain.20260314152704.051_20260314_152727 Paper: pytrain.20260314152704.051	Strictly Typed Modular Pipeline This benchmark evaluates a Python implementation of a strictly typed data processing pipeline. The system leverages Python's `typing.Protocol`, `typing.TypeVar`, and `typing.Generic` modules to enforce structural subtyping and data integrit...	03-14 15:27	Success	-	View
exp_2505.11707v1_20260314_152540 Paper: 2505.11707v1	Benchmark: Structure-then-Detail Token Merging (SDTM) Architecture SDTM is a post-training token merging technique for Diffusion Transformers (DiT). It exploits "structure-then-detail" denoising priors to identify and prune redundant tokens that the attention mechanism ignores. The archite...	03-14 15:25	Success	-	View
exp_2505.22654v3_20260314_152443 Paper: 2505.22654v3	VScan: Rethinking Visual Token Reduction for Efficient Large Vision-Language Models Architecture: VScan proposes a two-stage visual token reduction framework to handle LVLM bottlenecks: 1. Encoding Stage: Implements token merging via complementary global and local scans. 2. LLM Stage: Introduces pruning at inte...	03-14 15:24	Success	-	View
exp_2403.02991v1_20260314_152357 Paper: 2403.02991v1	MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer Architecture: MADTP introduces two plug-in modules for Vision-Language Transformers (VLTs): 1. MAG (Multi-modality Alignment Guidance): Aligns semantic features across modalities before pruning to ensure tokens critical to both visi...	03-14 15:24	Success	-	View
exp_2403.10030v3_20260314_152309 Paper: 2403.10030v3	Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers Architecture Proposes Multi-criteria Token Fusion (MCTF) to reduce the quadratic complexity of Vision Transformers. Instead of standard pruning, MCTF fuses tokens based on similarity, informativeness, and cluster size. It utilizes "...	03-14 15:23	Success	-	View
exp_pytrain.20260314152106.050_20260314_152126 Paper: pytrain.20260314152106.050	Python Skill Fallback Title: Strictly Typed CLI Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 15:21	Success	-	View
exp_2510.10528v2_20260314_151932 Paper: 2510.10528v2	Merlin's Whisper Benchmark Architecture: Whisper is not a model architecture but a black-box prompting framework. It functions as an inference wrapper that iteratively refines input prompts to persuade LLMs to generate concise responses, bypassing the verbose Cha...	03-14 15:19	Success	-	View
exp_2511.06283v2_20260314_151841 Paper: 2511.06283v2	TinyChemVL: Efficient Chemical Vision-Language Benchmarking Architecture: TinyChemVL is a 4B parameter Vision-Language Model (VLM) optimized for chemical reasoning. It employs a visual token reduction mechanism to filter non-informative backgrounds, focusing processing power on molecular structu...	03-14 15:18	Success	-	View
exp_2503.20540v1_20260314_151753 Paper: 2503.20540v1	Beyond Intermediate States: Explaining Visual Redundancy through Language Summary for ARES 8GB Roadmap * Architecture: Proposes a "Dual-Perspective" pruning mechanism. Instead of relying on intermediate attention maps, it defines redundancy by analyzing textual output variations against visual input pertu...	03-14 15:18	Success	-	View
exp_2505.12359v1_20260314_151710 Paper: 2505.12359v1	Benchmark for STAR: Stage-Wise Attention-Guided Token Reduction Architecture: STAR is a training-free, plug-and-play framework for Large Vision-Language Models (LVLMs) that utilizes a two-stage token reduction strategy. It performs early-stage pruning based on visual self-attention to remove...	03-14 15:17	Success	-	View
exp_pytrain.20260314151449.049_20260314_151513 Paper: pytrain.20260314151449.049	Python Skill Fallback Title: Robust Dynamic Plugin Loader with Runtime Protocol Enforcement - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 15:15	Success	-	View
exp_2505.19217v1_20260314_151331 Paper: 2505.19217v1	The Overthinker's DIET: Benchmarking Efficiency & Performance Architecture DIET is a training framework, not a structural modification, utilizing Reinforcement Learning (RL) to optimize the efficiency-performance trade-off. It employs "Advantage Weighting" to stabilize group-normalized RL (specifi...	03-14 15:13	Success	-	View
exp_2506.00307v2_20260314_151245 Paper: 2506.00307v2	Lossless Token Sequence Compression Benchmark Paper: Lossless Token Sequence Compression via Meta-Tokens Architecture: Proposes a task-agnostic, lossless compression algorithm similar to LZ77. It identifies repeated subsequences within the input context and replaces them with u...	03-14 15:12	Success	-	View
exp_2408.12742v1_20260314_151158 Paper: 2408.12742v1	TReX: Reusing Vision Transformer's Attention for Efficient Xbar-based Computing Architecture TReX proposes a hardware-algorithm co-design for Xbar-based In-Memory Computing (IMC). It optimizes Vision Transformers (ViTs) by strategically reusing attention maps from earlier encoder layers in later layers. This by...	03-14 15:12	Success	-	View
exp_2402.16058v1_20260314_151048 Paper: 2402.16058v1	Gist-COCO Efficiency Benchmark Architecture: Gist-COCO utilizes a trainable "plugin" encoder to compress lengthy input prompts into a small set of "gist" tokens. Crucially, it employs a "gist verbalization" mechanism to translate these compressed representations back...	03-14 15:11	Success	-	View
exp_pytrain.20260314150845.048_20260314_150911 Paper: pytrain.20260314150845.048	Generic Plugin Loader with Entry Point Simulation Overview This coding drill validates a developer's ability to implement a robust, type-safe plugin system using modern Python 3.12 features. The benchmark simulates a simplified packaging environment where "plugins" are discovered and loade...	03-14 15:09	Success	-	View
exp_2304.00341v1_20260314_150724 Paper: 2304.00341v1	JacobiNeRF Memory & Speed Benchmark Architecture JacobiNeRF utilizes a standard NeRF backbone but augments the training process with a second-order regularization objective. It explicitly aligns the Jacobians of correlated scene points to model mutual information, rather...	03-14 15:07	Success	-	View
exp_2510.09085v1_20260314_150640 Paper: 2510.09085v1	FLToP CTC: Frame-Level Token Pruning via Relative Threshold for Efficient and Memory-Saving Decoding on Diverse Platform... Architecture: FLToP CTC optimizes the decoding stage of CTC-based ASR models (e.g., wav2vec 2.0). Rather than exhaustive token computation, it implements frame-level token pruning guided by a relative probability threshold to dynamica...	03-14 15:06	Success	-	View
exp_2511.03929v2_20260314_150554 Paper: 2511.03929v2	Benchmark for NVIDIA Nemotron Nano V2 VL Architecture: Utilizes a hybrid Mamba-Transformer backbone (successor to the 8B Llama-3.1 variant) optimized for multimodal inputs (text, documents, video). It incorporates innovative token reduction techniques to manage long-co...	03-14 15:06	Success	-	View
exp_2511.22235v2_20260314_150509 Paper: 2511.22235v2	Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation Architecture Proposes the Coordinator-Executor-State Tracker (CES) framework to decouple high-level reasoning from execution. The system utilizes a Coordinator for planning, a State Tracker for context compression/history ma...	03-14 15:05	Success	-	View
exp_2512.02700v4_20260314_150426 Paper: 2512.02700v4	Benchmark Design for VLM-Pruner VLM-Pruner optimizes VLMs for memory-constrained hardware via a training-free token pruning mechanism. * Architecture: Introduces a "Centrifugal" pruning paradigm and a Buffering for Spatial Sparsity (BSS) criterion. This ba...	03-14 15:04	Success	-	View
exp_pytrain.20260314150214.047_20260314_150238 Paper: pytrain.20260314150214.047	Type-Safe Generic Registry Benchmark This benchmark evaluates the implementation of a robust, type-safe plugin registry system using Python's advanced type hinting features (`typing.Protocol`, `typing.Generic`, and `runtime_checkable`). Objective Create a generic `Registry` cl...	03-14 15:02	Success	-	View
exp_2505.20100v1_20260314_150041 Paper: 2505.20100v1	AdaTP: Attention-Debiased Token Pruning for Video Large Language Models Architecture AdaTP is a training-free token pruning pipeline for Video LLMs. It addresses the redundancy in visual tokens by correcting two specific biases in standard attention scores: global bias (over-focusing on temporal sequence en...	03-14 15:00	Success	-	View
exp_2506.12707v1_20260314_145947 Paper: 2506.12707v1	SecurityLingua Benchmark Architecture: Utilizes a dual-stage pipeline comprising a lightweight "Intent Compressor" and the Target LLM. The compressor extracts the true intent (detecting malicious payloads) and injects this analysis into the system prompt, while...	03-14 15:00	Success	-	View
exp_2506.16369v2_20260314_145859 Paper: 2506.16369v2	Prompt-based Dynamic Token Pruning (PrATo) Benchmark Architecture PrATo introduces a dynamic token pruning layer for Vision Transformers (ViTs). It utilizes a spatial prompt to generate a prior that ranks tokens by relevance. Low-relevance tokens are down-weighted and excluded from proces...	03-14 14:59	Success	-	View
exp_2407.02043v1_20260314_145809 Paper: 2407.02043v1	Concise and Precise Context Compression Benchmark Summary for ARES 8GB Roadmap This paper introduces a context compression framework designed to reduce the memory overhead of API documentation for tool-using LLMs. Architecture: The approach utilizes a dual-strategy mechanism. **Sel...	03-14 14:58	Success	-	View
exp_2407.15504v2_20260314_145727 Paper: 2407.15504v2	Fundamental Limits of Prompt Compression: A Rate-Distortion Framework for Black-Box Language Models Architecture: Formalizes token-level prompt compression via a Rate-Distortion (R-D) framework, deriving theoretical performance limits using Linear Programming (LP). Memory Footprint: Significantly reduces input token counts, direct...	03-14 14:57	Success	-	View
exp_pytrain.20260314145504.046_20260314_145537 Paper: pytrain.20260314145504.046	Extensible Command Registry with Protocol Enforcement Overview This coding drill demonstrates a robust, modular command-line framework built entirely with the Python standard library. It simulates advanced packaging concepts such as namespace separation and entry-point discovery using `typing....	03-14 14:55	Success	-	View
exp_2407.19410v1_20260314_144335 Paper: 2407.19410v1	AdaCoder Benchmark Suite Architecture: A lightweight wrapper for Visual Programmatic Models (e.g., ViperGPT). It uses a question-type classifier to retrieve task-specific, compressed "pre-prompts" containing only relevant API definitions, filtering out unnecess...	03-14 14:53	Success	-	View
exp_2510.22963v3_20260314_144249 Paper: 2510.22963v3	Benchmark for CompressionAttack: Semantic Drift and Performance Evaluation Architecture: Focuses on the prompt compression module within LLM agent pipelines. Introduces CompressionAttack, which exploits compression layers via HardCom (discrete adversarial edits) and SoftCom (latent-space pertur...	03-14 14:42	Success	-	View
exp_2511.15098v1_20260314_144207 Paper: 2511.15098v1	README: Benchmarking Visual Token Redundancy in dMLLMs Analysis: Visual Token Redundancy in Discrete Diffusion MLLMs This paper investigates optimization strategies for discrete diffusion-based Multimodal LLMs (dMLLMs) to address the computational overhead of full-sequence attention dur...	03-14 14:42	Success	-	View
exp_2511.19928v1_20260314_144124 Paper: 2511.19928v1	Benchmark: Context-Aware Token Pruning and Discriminative Attention (CPDATrack) Architecture: CPDATrack optimizes one-stream Vision Transformer (ViT) trackers via two key mechanisms: 1) A learnable Token Pruning Module positioned between encoder layers that estimates target probabilities and discards low-probab...	03-14 14:41	Success	-	View
exp_2505.23617v2_20260314_144030 Paper: 2505.23617v2	Grounded Video Tokenization (TrajViT) Benchmark Architecture: TrajViT replaces standard space-time patches with panoptic sub-object trajectories, generating a single token per semantic object track rather than per grid block. Memory & Speed: Achieves 10x token reduction a...	03-14 14:40	Success	-	View
exp_pytrain.20260314143815.045_20260314_143836 Paper: pytrain.20260314143815.045	Robust Dynamic Plugin Loader with Runtime Type Enforcement Overview This coding drill focuses on advanced Python metaprogramming, specifically dynamic module loading and structural subtyping (protocols). The objective is to build a self-contained system that acts as a strict plugin loader, verifyin...	03-14 14:38	Success	-	View
exp_2408.03094v1_20260314_142646 Paper: 2408.03094v1	Benchmark: 500xCompressor Efficiency Simulation Architecture: 500xCompressor is a lightweight encoder (pretrained on Arxiv) that compresses long text sequences into single special tokens. It uniquely relies on Key-Value (KV) preservation rather than embeddings to maintain semantic in...	03-14 14:36	Success	-	View
exp_2409.01227v3_20260314_142559 Paper: 2409.01227v3	Prompt Compression with Context-Aware Sentence Encoding for Fast and Improved LLM Inference Architecture Proposes Context-Aware Prompt Compression (CPC), utilizing a contrastive learning-based sentence encoder. The model scores sentence relevance against the specific query, filtering irrelevant data at the sentence level rathe...	03-14 14:26	Success	-	View
exp_2308.08758v3_20260314_142507 Paper: 2308.08758v3	Benchmarking Discrete Prompt Compression with Reinforcement Learning (PCRL) Architecture: PCRL introduces a lightweight, RL-trained policy network that performs discrete token-level editing (deletion/substitution) on prompts. It treats the LLM as a black-box environment, requiring no gradient access or labeled...	03-14 14:25	Success	-	View
exp_pytrain.20260314142317.044_20260314_142335 Paper: pytrain.20260314142317.044	Robust Typed Plugin Loader A coding drill benchmark focusing on Python's `typing` module, specifically `Protocol` and `runtime_checkable`, combined with dynamic module loading using `importlib`. Objective Implement a robust runtime plugin loader that: 1. Dynamically...	03-14 14:23	Success	-	View
exp_2406.18294v2_20260314_141159 Paper: 2406.18294v2	Benchmark: Hierarchical Context Pruning (HCP) Architecture: Hierarchical Context Pruning (HCP) is a context-management strategy, not a model weight modification. It parses repositories into a function-level dependency graph. The architecture retains topological file dependencies an...	03-14 14:22	Success	-	View
exp_2510.14393v1_20260314_141100 Paper: 2510.14393v1	Benchmark for Low Power Vision Transformer Accelerator Architecture Shifts optimization focus from self-attention to the Feed-Forward Network (FFN), identified as the bottleneck for short-token Vision Transformers. Implements algorithm-hardware co-design using dynamic token pruning and repl...	03-14 14:11	Success	-	View
exp_pytrain.20260314140855.043_20260314_140911 Paper: pytrain.20260314140855.043	Typed Module Scaffolder & Validator Objective This benchmark tests your ability to construct robust Python filesystem utilities using modern type annotations. You will implement a lightweight package scaffolder that leverages `typing.TypedDict` for metadata definitions and `p...	03-14 14:09	Success	-	View
exp_2510.27135v1_20260314_140730 Paper: 2510.27135v1	Benchmark Design: E-MMDiT Efficiency Analysis Architecture: E-MMDiT is a 304M parameter Multimodal Diffusion Transformer (MMDiT) optimized for token efficiency. It employs a highly compressive visual tokenizer and a multi-path compression module to reduce sequence length. Key innov...	03-14 14:07	Success	-	View
exp_2511.08128v1_20260314_140647 Paper: 2511.08128v1	Sentence-Anchored Gist Compression for Long-Context LLMs Architecture: Introduces "Sentence-Anchored Gist Compression," utilizing learned compression tokens integrated into pre-trained LLMs via fine-tuning. Memory Footprint: Significantly reduces KV cache storage and memory bandwidth. Val...	03-14 14:06	Success	-	View
exp_2511.15244v2_20260314_140607 Paper: 2511.15244v2	Context Cascade Compression (C3) Benchmark Architecture: C3 utilizes a cascaded design: a small "Compressor" LLM encodes long contexts into fixed-length latent vectors (e.g., 32–64 tokens), which a large "Decoder" LLM subsequently processes for generation. Memory Footprint:...	03-14 14:06	Success	-	View
exp_2512.12560v1_20260314_140525 Paper: 2512.12560v1	StreamingAssistant: Efficient Visual Token Pruning for Accelerating Online Video Understanding Architecture: StreamingAssistant optimizes Multimodal LLMs for video via a token pruning framework. It introduces the MSSAVT metric to evaluate spatial redundancy and employs a "masked pruning strategy" to remove mutually unadjacent tok...	03-14 14:05	Success	-	View
exp_2504.04787v1_20260314_140437 Paper: 2504.04787v1	Dynamic Vision Mamba (DyVM) Efficiency Benchmark Architecture: DyVM optimizes Mamba vision backbones by addressing spatial redundancy via Dynamic Token Merging (rearranging pruned sequences before SSM layers to prevent training-inference mismatch) and Dynamic Block Skipping (s...	03-14 14:04	Success	-	View
exp_pytrain.20260314140220.042_20260314_140244 Paper: pytrain.20260314140220.042	Generic Data Container Refactoring using PEP 695 This benchmark evaluates a refactoring of a generic data container utilizing PEP 695 (Type Parameter Syntax). The primary goal is to eliminate boilerplate code associated with `typing.TypeVar` and `typing.Generic`, improving namespace m...	03-14 14:02	Success	-	View
exp_2504.08966v1_20260314_135055 Paper: 2504.08966v1	Benchmark for PACT (Pruning and Clustering-Based Token Reduction) Architecture: PACT optimizes Visual Language Models (VLMs) by deploying a dual-strategy token reduction module at early LLM layers. It utilizes a novel, attention-free importance metric for pruning irrelevant tokens and applies Distance...	03-14 14:00	Success	-	View
exp_2504.11004v1_20260314_135010 Paper: 2504.11004v1	Dynamic Compressing Prompts for Efficient Inference of Large Language Models Architecture LLM-DCP utilizes a reinforcement learning framework where a lightweight policy network (DCP-Agent) treats prompt compression as a Markov Decision Process (MDP). The agent sequentially evaluates and prunes tokens based on a...	03-14 13:50	Success	-	View
exp_pytrain.20260314134747.041_20260314_134821 Paper: pytrain.20260314134747.041	Type-Safe Entrypoint Dispatcher Overview This coding drill demonstrates a robust, type-safe command dispatcher implemented in Python standard library. It leverages `typing.TypedDict` for configuration schema definition and `typing.Protocol` for structural interface enforc...	03-14 13:48	Success	-	View
exp_2505.17827v2_20260314_134613 Paper: 2505.17827v2	Not All Tokens Are What You Need In Thinking Architecture: Introduces Conditional Token Selection (CTS), a token-level compression framework. It utilizes conditional importance scoring to identify and prune non-essential reasoning tokens, training models to generate compressed...	03-14 13:46	Success	-	View
exp_2505.21233v2_20260314_134517 Paper: 2505.21233v2	Benchmark for CROP: Contextual Region-Oriented Visual Token Pruning Architecture CROP introduces a query-driven localization module to identify relevant image regions, followed by a two-stage pruning strategy. It offers Pre-LLM Compression (PLC) for adaptive spatial downsampling and Inner-LLM Pruning (I...	03-14 13:45	Success	-	View
exp_2505.22038v2_20260314_134426 Paper: 2505.22038v2	Balanced Token Pruning (BTP) Benchmark Architecture: Balanced Token Pruning (BTP) is a plug-and-play inference strategy for LVLMs that optimizes vision token reduction. It utilizes a multi-stage approach with a small calibration set to balance local output consistency agains...	03-14 13:44	Success	-	View
exp_2506.05709v1_20260314_134347 Paper: 2506.05709v1	Token Transforming: A Unified and Training-Free Token Compression Framework for Vision Transformer Acceleration Architecture Proposes a "Token Transforming" framework that unifies token pruning and merging into an explicit matrix transformation operation. By generalizing token reduction as a many-to-many mapping, it preserves more information tha...	03-14 13:43	Success	-	View
exp_2506.10967v2_20260314_134300 Paper: 2506.10967v2	Benchmark: CDPruner for Visual Token Pruning Architecture CDPruner replaces standard attention or similarity-based pruning with a Determinantal Point Process (DPP) algorithm. It calculates "conditional diversity" to select a subset of visual tokens that are both representative of...	03-14 13:43	Success	-	View
exp_pytrain.20260314134051.040_20260314_134111 Paper: pytrain.20260314134051.040	Strictly Typed Plugin Registry with Dynamic Module Discovery Overview This benchmark tests the ability to construct a robust, type-safe plugin system using Python's standard library. It simulates a simplified architecture similar to PyTorch or LitGPT, where model architectures are registered dynamica...	03-14 13:41	Success	-	View
exp_2407.14057v1_20260314_132917 Paper: 2407.14057v1	Benchmark Design: LazyLLM Simulation Architecture: LazyLLM introduces dynamic token pruning within the attention mechanism. Unlike static pruning, it re-evaluates token importance at each generation step, skipping KV cache computation for tokens deemed irrelevant to the im...	03-14 13:39	Success	-	View
exp_2408.08604v5_20260314_132808 Paper: 2408.08604v5	Benchmark for Bi-Directional Deep Contextual Video Compression (DCVC-B) Paper: Bi-Directional Deep Contextual Video Compression (DCVC-B) Architecture: DCVC-B replaces traditional hybrid coding with a deep learning framework optimized for B-frames. It utilizes a bi-directional motion difference context p...	03-14 13:28	Success	-	View
exp_2409.01179v3_20260314_132719 Paper: 2409.01179v3	Recoverable Compression Benchmark Architecture: A training-free, plug-and-play module for Large Multimodal Models (LMMs). It utilizes cross-modal similarity between the textual prompt and visual feature maps to dynamically recover semantically relevant visual tokens whi...	03-14 13:27	Success	-	View
exp_pytrain.20260314132426.039_20260314_132504 Paper: pytrain.20260314132426.039	Python Skill Fallback Title: Generic Model Registry with Strict Typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 13:25	Success	-	View
exp_2401.04975v1_20260314_132231 Paper: 2401.04975v1	HaltingVT Benchmark Architecture: HaltingVT modifies Joint Space-Time Video Transformers by introducing a "Glimpser" module that performs adaptive, layer-wise token pruning. It dynamically removes redundant spatial-temporal tokens—specifically targeting mi...	03-14 13:22	Success	-	View
exp_2303.06522v1_20260314_132128 Paper: 2303.06522v1	Token Sparsification for Faster Medical Image Segmentation Architecture: Proposes a Sparse-Completion-Dense (SCD) pipeline to enable token sparsification for segmentation. The method employs Soft-topK Token Pruning (STP) using a lightweight sub-network for differentiable token selection. It...	03-14 13:21	Success	-	View
exp_2510.16092v1_20260314_132035 Paper: 2510.16092v1	Compressing Many-Shots in In-Context Learning Architecture: Introduces MemCom, a layer-wise compression technique for In-Context Learning (ICL). Unlike standard prompt pruning, MemCom utilizes a dedicated compressor network to generate "soft-token" summaries at **every transfor...	03-14 13:20	Success	-	View
exp_2511.10488v1_20260314_131935 Paper: 2511.10488v1	SPOT: Sparsification Benchmark Architecture: SPOT introduces lightweight relevance predictors into standard Vision Transformer (ViT) blocks. These modules analyze token embeddings and inter-layer attention dynamics to identify and prune redundant tokens prior to th...	03-14 13:19	Success	-	View
exp_pytrain.20260314131714.038_20260314_131737 Paper: pytrain.20260314131714.038	Robust Type-Safe Plugin Registry with Runtime Discovery Overview This benchmark implements a modular plugin architecture in pure Python. It demonstrates the utility of Python's `typing.Protocol` for defining structural interfaces (subtyping) and `inspect` for runtime discovery and registration o...	03-14 13:17	Success	-	View
exp_2504.17040v2_20260314_131534 Paper: 2504.17040v2	DyMU: Dynamic Merging and Virtual Unmerging Benchmark Architecture: DyMU optimizes VLMs via two training-free modules: Dynamic Token Merging (DToMe) and Virtual Token Unmerging (VTU). DToMe prunes redundant ViT tokens based on image complexity, while VTU reconstructs attention masks for th...	03-14 13:15	Success	-	View
exp_2303.14526v1_20260314_131433 Paper: 2303.14526v1	Benchmark: Selective Structured State-Spaces (S5) for Video Architecture: S5 (Selective Structured State-Space) improves upon the S4 architecture by introducing a lightweight mask generator. This module adaptively prunes redundant image tokens, avoiding the quadratic complexity of dense self...	03-14 13:14	Success	-	View
exp_2511.18920v1_20260314_131343 Paper: 2511.18920v1	EventSTU: Event-Guided Efficient Spatio-Temporal Understanding for Video Large Language Models Architecture: EventSTU is a training-free framework for Video LLMs that optimizes spatio-temporal processing. It utilizes simulated events (pixel changes between frames) to guide a coarse-to-fine keyframe sampling strategy (temp...	03-14 13:13	Success	-	View
exp_2512.03643v1_20260314_131246 Paper: 2512.03643v1	Optical Context Compression Is Just (Bad) Autoencoding Architecture: The study benchmarks DeepSeek-OCR’s Vision Encoder against two lightweight alternatives: parameter-free Mean Pooling and a learned Hierarchical Encoder. Memory Footprint & Speed: Vision encoders introduce significant p...	03-14 13:12	Success	-	View
exp_pytrain.20260314131022.037_20260314_131048 Paper: pytrain.20260314131022.037	Python Skill Fallback Title: Typed Package Scaffolder & Import Manager - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 13:10	Success	-	View
exp_2503.23959v2_20260314_130909 Paper: 2503.23959v2	Local-Aware Token Pruning (ALTP) Benchmark Summary for ARES 8GB Roadmap Architecture: ALTP (Adaptive Local-Aware Token Pruning) accelerates Grounded Conversation Generation models (e.g., GLaMM, OMG-LLaVA) by integrating two lightweight modules: Detail Density Capture (DDC) a...	03-14 13:09	Success	-	View
exp_2504.02438v5_20260314_130819 Paper: 2504.02438v5	Benchmarking ViLAMP: Hierarchical Differential Distillation Architecture: ViLAMP introduces "Differential Distillation," a hierarchical method treating video tokens with "mixed precision." It isolates task-relevant keyframes for full-patch processing while compressing non-keyframes to query-sali...	03-14 13:08	Success	-	View
exp_2505.18051v3_20260314_130720 Paper: 2505.18051v3	LookWhere? Efficient Visual Recognition Benchmark Architecture: Introduces a dual-branch adaptive system comprising a low-resolution Selector (identifies ROIs) and a high-resolution Extractor (processes only relevant patches). This decouples "where to look" from "what to see,"...	03-14 13:07	Success	-	View
exp_2511.16943v2_20260314_130632 Paper: 2511.16943v2	RASTP: Representation-Aware Semantic Token Pruning for Generative Recommendation with Semantic Identifiers Architecture RASTP introduces a dynamic token pruning layer for Generative Recommendation systems. To handle the bloat caused by long Semantic Identifiers (SIDs), it calculates a composite importance score combining *Semantic Saliency...	03-14 13:06	Success	-	View
exp_pytrain.20260314130359.036_20260314_130424 Paper: pytrain.20260314130359.036	Typed Module Loader & Validator Overview This benchmark demonstrates a robust, autonomous system for safely loading and validating third-party Python modules at runtime. It simulates a package installation process where code is dynamically generated, written to disk, and...	03-14 13:04	Success	-	View
exp_2504.08934v1_20260314_125239 Paper: 2504.08934v1	This benchmark evaluates the GistPool methodology against standard Average Pooling for Long Context In-Context C... Architecture: GistPool is an in-context compression technique designed for decoder-only transformers. It addresses the information loss and capacity limitations of previous "Gisting" methods by integrating average pooling principles to...	03-14 13:02	Success	-	View
exp_2504.12778v1_20260314_125129 Paper: 2504.12778v1	Towards Lossless Token Pruning in Late-Interaction Retrieval Models Architecture: Modifies Late Interaction (ColBERT) training using regularization losses to force non-essential token embeddings to zero, enabling lossless static pruning. Memory Footprint: Critical for 8GB VRAM. Reduces index...	03-14 12:51	Success	-	View
exp_2504.16574v1_20260314_125024 Paper: 2504.16574v1	PIS: Prompt Importance Sampling Benchmark PIS Architecture: The paper proposes a dual-level compression framework utilizing a lightweight 9-layer Reinforcement Learning (RL) agent coupled with "Russian Roulette" semantic sampling. It quantifies token saliency using the target L...	03-14 12:50	Success	-	View
exp_pytrain.20260314124800.035_20260314_124836 Paper: pytrain.20260314124800.035	Python Skill Fallback Title: Dynamic Generic Plugin Pipeline - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 12:48	Success	-	View
exp_2504.21263v1_20260314_124621 Paper: 2504.21263v1	Embracing Collaboration Over Competition: Condensing Multiple Prompts for Visual In-Context Learning Architecture: Condenser is a lightweight, trainable external plugin for Visual In-Context Learning (VICL). Instead of selecting a single prompt or ensembling, it performs "prompt condensation," fusing fine-grained context from multiple...	03-14 12:46	Success	-	View
exp_2505.11471v1_20260314_124400 Paper: 2505.11471v1	CRISP: Efficiency Benchmark Simulation Architecture: CRISP modifies Multi-Vector retrieval (specifically ColBERT-style) by integrating clustering objectives directly into the end-to-end training loop. It learns to prune "noisy" tokens, creating representations that are inher...	03-14 12:44	Success	-	View
exp_pytrain.20260314124121.034_20260314_124143 Paper: pytrain.20260314124121.034	Dynamic Plugin Loader with Runtime Type Verification This benchmark evaluates the ability to implement a robust dynamic plugin loading system. It tests the candidate's proficiency with the `importlib` library, `typing.Protocol` for structural subtyping, and file system management using `pathl...	03-14 12:41	Success	-	View
exp_2505.13975v3_20260314_123932 Paper: 2505.13975v3	DRP: Distilled Reasoning Pruning with Skill-aware Step Decomposition for Efficient Large Reasoning Models Architecture: DRP utilizes a hybrid teacher-student framework. A teacher model performs skill-aware step decomposition to prune verbose reasoning chains. These compact paths are distilled into a student model via standard Supervised Fin...	03-14 12:39	Success	-	View
exp_2505.18757v2_20260314_123838 Paper: 2505.18757v2	ToDRE: Effective Visual Token Pruning via Token Diversity and Task Relevance ToDRE is a training-free, two-stage framework for efficient Large Vision-Language Model (LVLM) inference. * Architecture: 1. Token Diversity (Post-Encoder): Uses a greedy max-sum diversification algorithm to select representativ...	03-14 12:38	Success	-	View
exp_2506.04997v1_20260314_123741 Paper: 2506.04997v1	Benchmark Proposal: Light-ColPali/ColQwen2 (Token Merging) Architecture: Introduces Light-ColPali/ColQwen2, an optimization of late-interaction visual document retrievers (VDR) based on ColBERT-style architecture. Indexing & Strategy: Rejects token pruning (due to the loss of query-agno...	03-14 12:37	Success	-	View
exp_2407.05941v4_20260314_123656 Paper: 2407.05941v4	Pruning One More Token is Enough: Leveraging Latency-Workload Non-Linearities for Vision Transformers on the Edge Architecture: Introduces a training-free token pruning schedule for Vision Transformers (ViTs) that exploits non-linear latency-workload correlations specific to edge hardware. Memory Footprint: Significantly reduces activation memo...	03-14 12:36	Success	-	View
exp_pytrain.20260314123421.033_20260314_123457 Paper: pytrain.20260314123421.033	Python Reliability Drill: Typing & Packaging Benchmark This benchmark evaluates the robustness of a pure-Python "Inference Engine" simulation, focusing on strict type enforcement (`typing`), package metadata handling (`packaging`), and deterministic resource telemetry. It mocks the behavior of...	03-14 12:35	Success	-	View
exp_2407.08892v1_20260314_122306 Paper: 2407.08892v1	Benchmark: Prompt Compression Methods for Long Context Summary for ARES 8GB Roadmap This study evaluates three prompt compression paradigms—extractive, abstractive, and token pruning—to mitigate the high memory and compute costs of long-context inference. * Architecture: A comparative a...	03-14 12:33	Success	-	View
exp_2408.00274v1_20260314_122207 Paper: 2408.00274v1	QUITO: Accelerating Long-Context Reasoning through Query-Guided Context Compression Architecture: QUITO is a lightweight, plug-in attention compressor for RAG pipelines. It computes the attention distribution of a "trigger token" (the query) over retrieved context tokens to identify and retain relevant information. **R...	03-14 12:22	Success	-	View
exp_2408.10497v3_20260314_122110 Paper: 2408.10497v3	QUITO-X: A New Perspective on Context Compression from the Information Bottleneck Theory QUITO-X optimizes long-context handling for 8GB VRAM constraints by applying Information Bottleneck (IB) theory to compress prompts based on query relevance. * Architecture: Replaces standard self-information metrics with a **cross-...	03-14 12:21	Success	-	View
exp_pytrain.20260314121825.032_20260314_121900 Paper: pytrain.20260314121825.032	Type-Safe Plugin Registry Benchmark Overview This benchmark evaluates the implementation of a robust, type-safe plugin registry system in Python. It leverages Python's `typing.Protocol` to enforce structural subtyping (duck typing) at registration time, ensuring that all regi...	03-14 12:19	Success	-	View
exp_2409.14364v4_20260314_121649 Paper: 2409.14364v4	Position IDs Matter: An Enhanced Position Layout for Efficient Context Compression in Large Language Models Enhanced Position Layout (EPL) improves context compression via position ID manipulation. * Architecture: Modifies the position indices of special "gist" or compression tokens to minimize the distance to source context tokens, prese...	03-14 12:16	Success	-	View
exp_2402.18700v2_20260314_121458 Paper: 2402.18700v2	Benchmark: Natural Language Prompt Encapsulation (Nano-Capsulator) Paper: Learning to Compress Prompt in Natural Language Formats (Nano-Capsulator) * Architecture: Proposes a reinforcement learning framework that distills long prompts into dense "Capsule Prompts" in natural language. It utilizes a...	03-14 12:15	Success	-	View
exp_2309.15755v2_20260314_121358 Paper: 2309.15755v2	CAIT: Triple-Win Compression towards High Accuracy, Fast Inference, and Favorable Transferability For ViTs Architecture CAIT proposes a dual-strategy compression pipeline for Vision Transformers (ViTs). It integrates Asymmetric Token Merging (ATME), which merges neighboring tokens to reduce sequence length while strictly preserving spati...	03-14 12:14	Success	-	View
exp_pytrain.20260314121101.031_20260314_121124 Paper: pytrain.20260314121101.031	Typed Micro-Package Architecture Benchmark This benchmark evaluates a candidate's ability to structure a Python script as a robust, installable micro-package. It focuses on strict static typing using `typing.Protocol` and proper namespace management using `__all__`. Benchmark Detail...	03-14 12:11	Success	-	View
exp_2309.16738v3_20260314_120919 Paper: 2309.16738v3	ELIP: Efficient Discriminative Language-Image Pre-training with Fewer Vision Tokens Paper: ELIP: Efficient Discriminative Language-Image Pre-training with Fewer Vision Tokens Architecture: ELIP proposes a trainable-parameter-free token pruning and merging mechanism for Vision Transformers (ViT) within Language-Imag...	03-14 12:09	Success	-	View
exp_2504.18579v4_20260314_120826 Paper: 2504.18579v4	Sparsity Forcing: Reinforcing Token Sparsity of MLLMs Architecture Introduces Sparsity Forcing, a Reinforcement Learning (RL) post-training framework for Multimodal LLMs (specifically Qwen2-VL/2.5-VL). It does not alter model weights but optimizes token selection by contrasting inference...	03-14 12:08	Success	-	View
exp_2512.00647v2_20260314_120739 Paper: 2512.00647v2	MambaScope: Coarse-to-Fine Scoping for Efficient Vision Mamba Summary for ARES 8GB Roadmap * Architecture: MambaScope proposes an adaptive "coarse-to-fine" wrapper for Vision Mamba (Vim). It replaces static high-resolution processing with a dynamic pipeline. The model initially processes the i...	03-14 12:07	Success	-	View
exp_2510.18043v1_20260314_120633 Paper: 2510.18043v1	CompactPrompt: A Unified Pipeline for Prompt Data Compression in LLM Workflows Architecture: CompactPrompt is a model-agnostic preprocessing pipeline. It utilizes "hard" prompt pruning via self-information scoring and dependency-based phrase grouping, paired with "soft" file-level compression (n-gram abbreviation...	03-14 12:06	Success	-	View
exp_pytrain.20260314120413.030_20260314_120435 Paper: pytrain.20260314120413.030	Dynamic Type-Safe Plugin Loader Benchmark This coding drill benchmarks your ability to construct a robust, runtime-validated plugin system using Python's standard library. You must implement a mechanism that dynamically discovers code modules within a temporary package structure, v...	03-14 12:04	Success	-	View
exp_2511.18691v1_20260314_120238 Paper: 2511.18691v1	EVCC: Enhanced Vision Transformer-ConvNeXt-CoAtNet Fusion Benchmark Architecture: EVCC is a multi-branch hybrid fusing ViT, ConvNeXt, and CoAtNet via a dynamic router gate and gated bidirectional cross-attention. Its primary efficiency mechanism is adaptive token pruning, which preserves information whi...	03-14 12:02	Success	-	View
exp_2512.08169v1_20260314_120119 Paper: 2512.08169v1	Information-Dense Reasoning for Efficient and Auditable Security Alert Triage Architecture: Hybrid cloud-edge framework (AIDR) employing a lightweight cloud router to dispatch alerts to specialized on-premise "expert" models for reasoning generation. Memory Footprint: Optimized for constrained environments. I...	03-14 12:01	Success	-	View
exp_2512.10324v1_20260314_120027 Paper: 2512.10324v1	Benchmark for EchoingPixels: Cross-Modal Adaptive Token Reduction Architecture: EchoingPixels optimizes Audio-Visual LLMs via the Cross-Modal Semantic Sieve (CS2). Instead of unimodal pruning, CS2 merges audio and video tokens into a single pool, using cross-modal co-attention to dynamically selec...	03-14 12:00	Success	-	View
exp_pytrain.20260314115746.029_20260314_115807 Paper: pytrain.20260314115746.029	Strict Package Metadata Inspector This coding drill validates your ability to use the Python standard library for system introspection and strict type safety. Objective Create a robust script `meta_inspector.py` (implemented within `benchmark.py`) that inspects installed Py...	03-14 11:58	Success	-	View
exp_2512.14244v4_20260314_115615 Paper: 2512.14244v4	EDU-based Context Compressor: Benchmark Architecture: Proposes a two-stage "structure-then-select" pipeline. First, LingoEDU parses linear text into a structural relation tree of Elementary Discourse Units (EDUs) anchored to source indices to prevent hallucinations. Second,...	03-14 11:56	Success	-	View
exp_2503.20384v2_20260314_115533 Paper: 2503.20384v2	Benchmark for MoLe-VLA: Dynamic Layer-skipping VLA Architecture: MoLe-VLA transforms static LLM inference into a dynamic "Mixture-of-Layers" framework. A Spatial-Temporal Aware Router (STAR) selectively activates specific LLM layers based on the robot's current state, treating layer...	03-14 11:55	Success	-	View
exp_2504.16786v1_20260314_115448 Paper: 2504.16786v1	MOOSComp: Improving Lightweight Long-Context Compressor via Mitigating Over-Smoothing and Incorporating Outlier Scores MOOSComp Analysis for ARES 8GB Roadmap * Architecture: Utilizes a lightweight BERT-based encoder for token classification. It mitigates over-smoothing via an inter-class cosine similarity loss during training and incorporates outlie...	03-14 11:54	Success	-	View
exp_2505.12215v2_20260314_115404 Paper: 2505.12215v2	GMSA Context Compression Benchmark Architecture: GMSA is an encoder-decoder framework designed to compress long-context inputs into a compact sequence of "soft tokens." It utilizes Group Merging to ensure uniform semantic aggregation and **Layer Semantic Alignment (L...	03-14 11:54	Success	-	View
exp_2403.15388v6_20260314_115326 Paper: 2403.15388v6	LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models Architecture: PruMerge inserts a lightweight optimization module between the visual encoder (e.g., CLIP) and the LLM. It utilizes a two-stage strategy: Pruning discards redundant visual tokens based on attention sparsity between the...	03-14 11:53	Success	-	View
exp_pytrain.20260314115105.028_20260314_115138 Paper: pytrain.20260314115105.028	Environment Metadata Auditor with PEP 695 Generics This drill verifies the ability to inspect the Python runtime environment using standard library tools (`importlib.metadata`) and modern typing features introduced in Python 3.12 (PEP 695 Type Parameter Syntax). Objective Create a script `b...	03-14 11:51	Success	-	View
exp_2510.08907v4_20260314_113950 Paper: 2510.08907v4	Semantic-Anchor Compression (SAC) Benchmark Architecture: Proposes Semantic-Anchor Compression (SAC), eliminating the need for autoencoding-based training. The method selects specific "anchor" tokens from the input context and aggregates information from the entire text into thei...	03-14 11:49	Success	-	View
exp_2512.01949v1_20260314_113902 Paper: 2512.01949v1	Script: Graph-Structured and Query-Conditioned Semantic Token Pruning for Multimodal Large Language Models Architecture: Script proposes a plug-and-play, training-free pipeline featuring two core modules: a graph-structured pruning module (to remove spatial redundancy) and a query-conditioned semantic pruning module (to retain task-relevant...	03-14 11:39	Success	-	View
exp_2505.15774v1_20260314_113818 Paper: 2505.15774v1	Hybrid Context Compression (HyCo2) Benchmark Paper: Beyond Hard and Soft: Hybrid Context Compression for Balancing Local and Global Information Retention Architecture: HyCo2 introduces a dual-module context compressor. It utilizes a hybrid adapter to refine global semantic...	03-14 11:38	Success	-	View
exp_pytrain.20260314113600.027_20260314_113620 Paper: pytrain.20260314113600.027	Robust Dynamic Plugin Loader with Protocol Validation Overview This coding drill benchmark tests your ability to design a robust, type-safe plugin architecture using only the Python Standard Library. It simulates an environment where code must be loaded dynamically at runtime from temporary fi...	03-14 11:36	Success	-	View
exp_2506.07851v2_20260314_113441 Paper: 2506.07851v2	Learning to Focus (LeaF) Benchmark Paper: Learning to Focus (LeaF) Architecture: LeaF is a training-phase distillation framework that utilizes a larger teacher model to perform gradient-based interventions. It identifies "confounding" tokens (distractors) in the...	03-14 11:34	Success	-	View
exp_2408.11799v1_20260314_113339 Paper: 2408.11799v1	Practical token pruning for foundation models in few-shot conversational virtual assistant systems Architecture: Utilizes contrastive-pretrained Sentence Transformers for intent classification. The core innovation is a Dynamic Token Pruning mechanism implemented via a multi-task adaptation approach, allowing the model to skip pro...	03-14 11:33	Success	-	View
exp_2409.13035v3_20260314_113249 Paper: 2409.13035v3	TACO-RL: Task Aware Prompt Compression Optimization with Reinforcement Learning Architecture: Utilizes a lightweight Transformer encoder (token classification policy) trained via the REINFORCE algorithm. Unlike task-agnostic pruning, it optimizes retention decisions using task-specific reward signals (e.g., ROUGE,...	03-14 11:32	Success	-	View
exp_2505.18227v3_20260314_113156 Paper: 2505.18227v3	Token Reduction Should Go Beyond Efficiency in Generative Models -- From Vision, Language to Multimodality Architecture: Position paper proposing unified token reduction (pruning/merging) strategies across Vision, Language, and Multimodal Transformers. Reframes reduction as a core design principle for model alignment and stability, not just...	03-14 11:32	Success	-	View
exp_pytrain.20260314112929.026_20260314_113006 Paper: pytrain.20260314112929.026	Python Skill Fallback Title: Type-Safe Component Registry with Dynamic Configuration - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 11:30	Success	-	View
exp_2505.18227v3_20260314_112742 Paper: 2505.18227v3	Benchmark Proposal: Semantic Token Reduction for Quality and Efficiency Architecture: Position paper proposing unified token reduction (pruning/merging) strategies across Vision, Language, and Multimodal Transformers. Reframes reduction as a core design principle for model alignment and stability, not just...	03-14 11:27	Success	-	View
exp_2511.18950v1_20260314_112654 Paper: 2511.18950v1	Compressor-VLA: Instruction-Guided Visual Token Compression for Efficient Robotic Manipulation Architecture Compressor-VLA introduces a hybrid, instruction-conditioned compression framework. It utilizes two distinct modules: a Semantic Task Compressor (STC) for holistic context and a Spatial Refinement Compressor (SRC) for fine-g...	03-14 11:26	Success	-	View
exp_2407.09014v3_20260314_112556 Paper: 2407.09014v3	Benchmark: CompAct (Compressing Retrieved Documents Actively) Architecture: Modular plug-in framework utilizing off-the-shelf dense retrievers (e.g., Contriever) and an iterative "Active Selector" policy network. Unlike static one-shot filters, it sequentially selects documents based on the evolvi...	03-14 11:26	Success	-	View
exp_2510.09156v1_20260314_112516 Paper: 2510.09156v1	Agentic-KGR: Co-evolutionary Knowledge Graph Construction through Multi-Agent Reinforcement Learning Architecture: A multi-agent reinforcement learning (RL) framework designed to co-evolve LLMs with Knowledge Graphs (KGs), specifically integrating with GraphRAG. Retrieval & Context: * Architecture: GraphRAG. * Indexing:...	03-14 11:25	Success	-	View
exp_pytrain.20260314112300.025_20260314_112327 Paper: pytrain.20260314112300.025	Type-Safe Dynamic ZipApp Packager This benchmark evaluates a system's ability to programmatically construct a Python application, perform static type checking to enforce interface compliance using `typing.Protocol`, and package the result into a standalone executable ZipApp...	03-14 11:23	Success	-	View
exp_2511.09883v1_20260314_112139 Paper: 2511.09883v1	HCC-3D: Hierarchical Compensatory Compression for 98% 3D Token Reduction in Vision-Language Models Architecture: HCC-3D solves the 3D-VLM context bottleneck where dense point-cloud tokens overwhelm the LLM. It utilizes a two-stage compressor preceding the LLM: Global Structure Compression (GSC), which employs learnable queries to agg...	03-14 11:21	Success	-	View
exp_2601.02365v1_20260314_112051 Paper: 2601.02365v1	FUSE: Failure-aware Usage of Subagent Evidence Architecture: FUSE replaces raw image prompting with a Grounded Design Representation (GDR), a compact JSON schema encoding canvas elements, styles, and structure. It utilizes a subagent architecture where tasks are routed to sp...	03-14 11:20	Success	-	View
exp_2511.14582v1_20260314_112006 Paper: 2511.14582v1	OmniZip: Audio-Guided Dynamic Token Compression Benchmark Architecture OmniZip is a training-free middleware framework for Omnimodal LLMs. It optimizes inference by using audio modality as an anchor to guide video token compression. The architecture calculates "audio retention scores" to ident...	03-14 11:20	Success	-	View
exp_2511.19718v1_20260314_111913 Paper: 2511.19718v1	Benchmark: Structural Reparameterization for Efficient Vision Transformers Architecture: Proposes a structural reparameterization technique that trains parallel multi-branch ViT blocks (spanning FFN and MHSA) which are mathematically consolidated into a single-path architecture for deployment. **Memory & Speed...	03-14 11:19	Success	-	View
exp_pytrain.20260314111631.024_20260314_111702 Paper: pytrain.20260314111631.024	Python Skill Fallback Title: Generic Pipeline CLI Engine - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 11:17	Success	-	View
exp_2504.03165v3_20260314_111446 Paper: 2504.03165v3	Benchmark for EDC2-RAG: Efficient Dynamic Clustering for RAG Architecture: EDC2-RAG is a post-retrieval optimization layer. It utilizes dynamic clustering (grouping retrieved chunks by semantic similarity) to identify and remove redundancy and noise before sending context to the LLM. **Retrieval...	03-14 11:14	Success	-	View
exp_2505.07861v3_20260314_111333 Paper: 2505.07861v3	Benchmark: Caprese - Scalable LLM Reasoning Acceleration Paper: Scalable LLM Reasoning Acceleration with Low-rank Distillation (Caprese) Architecture: Proposes low-rank distillation applied to feedforward (FFN) layers to recover math reasoning capabilities lost during quantization or prun...	03-14 11:13	Success	-	View
exp_2505.13506v1_20260314_111220 Paper: 2505.13506v1	EcoSafeRAG: Efficient Security through Context Analysis in Retrieval-Augmented Generation Architecture: A plug-and-play security module using "bait-guided" context diversity detection and sentence-level processing to filter corpus poisoning without relying on LLM internal knowledge. Retrieval Strategy: Functions as a pos...	03-14 11:12	Success	-	View
exp_pytrain.20260314111001.023_20260314_111029 Paper: pytrain.20260314111001.023	Benchmark: Asynchronous Plugin Loader with Strict Protocol Enforcement Overview This benchmark tests the ability to construct a robust, in-memory plugin architecture using Python's standard library. It combines `typing.Protocol` for strict interface definition and `asyncio` for concurrent execution to simulate...	03-14 11:10	Success	-	View
exp_2505.21334v3_20260314_110619 Paper: 2505.21334v3	HoliTom: Holistic Token Merging Benchmark Architecture: HoliTom introduces a training-free, dual-stage framework combining "Outer-LLM" and "Inner-LLM" token merging. 1. Outer-LLM: Performs global redundancy-aware temporal segmentation and spatio-temporal merging to handle l...	03-14 11:08	Success	-	View
exp_2506.12723v3_20260314_110527 Paper: 2506.12723v3	SP-VLA: A Joint Model Scheduling and Token Pruning Approach for VLA Model Acceleration Architecture: SP-VLA accelerates Vision-Language-Action models through joint model scheduling and token pruning. It introduces a dynamic scheduler that classifies actions as "deliberative" (requiring full VLA) or "intuitive" (offloaded...	03-14 11:05	Success	-	View
exp_2407.20485v2_20260314_110431 Paper: 2407.20485v2	A2SF: Accumulative Attention Scoring with Forgetting Factor Architecture: A2SF refines KV cache eviction logic in decoder-only models. It addresses the bias inherent in causal masking (where older tokens accumulate artificially high attention scores) by introducing a "Forgetting Factor" ($\gamma...	03-14 11:04	Success	-	View
exp_2401.07469v1_20260314_110319 Paper: 2401.07469v1	SUReID Benchmark Architecture: SUReID utilizes a Vision Transformer backbone featuring Hierarchical Token Sparsification (HTS). HTS dynamically prunes redundant and occluded tokens prior to the self-attention layer, effectively streamlining feature...	03-14 11:03	Success	-	View
exp_pytrain.20260314110052.022_20260314_110125 Paper: pytrain.20260314110052.022	Python Skill Fallback Title: PEP 695 Generic Result Monad Implementation - Focus: Typing, Packaging - Note: Generated fallback due to unavailable model output.	03-14 11:01	Success	-	View
exp_2510.18866v4_20260314_104850 Paper: 2510.18866v4	LightMem Benchmark Architecture: LightMem implements a three-stage memory pipeline inspired by human cognition: Sensory Memory (rapid filtering and topic-based compression), Short-Term Memory (topic-aware consolidation and summarization), and **Lo...	03-14 10:58	Success	-	View
exp_2511.12428v1_20260314_104755 Paper: 2511.12428v1	RedVTP: Training-Free Acceleration of Diffusion Vision-Language Models Inference via Masked Token-Guided Visual Token Pr... Architecture: RedVTP targets Diffusion Vision-Language Models (DVLMs) like LLaDA-V and LaViDa. It introduces a training-free, response-driven strategy to prune redundant visual tokens during parallel decoding. Memory Footprint: Sign...	03-14 10:47	Success	-	View
exp_2503.23455v1_20260314_104702 Paper: 2503.23455v1	Efficient Token Compression for Vision Transformer with Spatial Information Preserved Architecture: Introduces "Prune and Merge," a layer-wise compression module for Vision Transformers (ViTs). It integrates trainable merge and reconstruct matrices with shortcut connections to aggregate spatial information while discardi...	03-14 10:47	Success	-	View
exp_2506.05096v4_20260314_104557 Paper: 2506.05096v4	Astraea: Token-wise Acceleration Benchmark Architecture: Introduces a plug-in acceleration framework for Video Diffusion Transformers (vDiTs) centered on a lightweight token selection mechanism and a memory-efficient, GPU-compatible sparse attention strategy. **Optimization Stra...	03-14 10:46	Success	-	View
exp_pytrain.20260314104338.021_20260314_104404 Paper: pytrain.20260314104338.021	Self-Validating Entry-Point Loader Benchmark Overview This benchmark tests a developer's ability to construct a robust runtime plugin loader using Python's standard `typing` and `importlib` libraries. It simulates a micro-kernel architecture where functionality is discovered dynamical...	03-14 10:44	Success	-	View
exp_2406.20092v2_20260314_104132 Paper: 2406.20092v2	Visual Context Compression Benchmark Architecture: Proposes a Visual Context Compressor to prune redundant visual tokens. This is integrated using LLaVolta, a staged training scheme that progressively increases compression (heavy to light) to maintain visual semant...	03-14 10:41	Success	-	View
exp_2409.11182v1_20260314_104038 Paper: 2409.11182v1	Video Token Sparsification (VTS) Benchmark Architecture: VTS integrates a lightweight CNN-based proposal network to preprocess video inputs. It adaptively selects key frames and prunes redundant visual tokens to minimize the context window passed to the multimodal LLM. **Memory...	03-14 10:40	Success	-	View
exp_2510.19183v1_20260314_103920 Paper: 2510.19183v1	PruneHal: Multi-modal LLM Hallucination Mitigation Benchmark Architecture: PruneHal targets multimodal LLMs (MLLMs) by introducing adaptive KV cache pruning specifically for visual tokens. It identifies that redundant visual tokens dilute attention, causing hallucinations. The architecture dynami...	03-14 10:39	Success	-	View
exp_pytrain.20260314103540.020_20260314_103631 Paper: pytrain.20260314103540.020	Dynamic Kernel Dispatcher Benchmark This benchmark implements a robust, type-safe kernel registration and dispatch system. It mimics the architecture of high-performance libraries (like PyTorch or FlashAttention) where specific computational kernels are dynamically registered...	03-14 10:36	Success	-	View
exp_2510.20797v1_20260314_103321 Paper: 2510.20797v1	Simple Context Compression: Mean-Pooling and Multi-Ratio Training Architecture: Proposes a Mean-Pooling compressor for soft context compression within RAG pipelines. This replaces the heavier "compression-tokens" architecture by averaging embeddings. It employs multi-ratio training, enabli...	03-14 10:33	Success	-	View
exp_2511.08003v2_20260314_103218 Paper: 2511.08003v2	SharpV Benchmark SharpV Summary for ARES 8GB Roadmap Architecture: SharpV introduces a two-stage pruning framework to mitigate VideoLLM quadratic complexity. It first performs spatial-temporal adaptive token pruning (removing redundant frames/patche...	03-14 10:32	Success	-	View
exp_2511.17129v2_20260314_103119 Paper: 2511.17129v2	Benchmark: LLM2Comp Context Compression Efficiency Architecture: LLM2Comp adapts causal LLMs via a context compression pretext task. The model splits into a Compressor and a Predictor, learning to generate fixed-size "memory tokens" that represent the full context for sequence p...	03-14 10:31	Success	-	View
exp_pytrain.20260314102808.019_20260314_102832 Paper: pytrain.20260314102808.019	Type-Guarded Plugin Loader with Semantic Versioning Overview This benchmark tests the ability to construct a robust, type-safe plugin system using only the Python standard library. It simulates an environment where "Backend" models must be loaded dynamically based on strict interface complia...	03-14 10:28	Success	-	View
exp_2511.18832v1_20260314_101648 Paper: 2511.18832v1	This benchmark evaluates the performance impact of the "Concept than Document" context compression strategy. Architecture: Unsupervised AMR (Abstract Meaning Representation) graph compression framework. RAG Details: * Retrieval Strategy: Post-retrieval semantic filtering. It parses retrieved documents into AMR graphs to extract sem...	03-14 10:26	Success	-	View
exp_2512.04550v1_20260314_101550 Paper: 2512.04550v1	AdmTree: Context Compression Benchmark Architecture AdmTree implements a semantic binary tree for hierarchical context compression. Input is dynamically segmented based on information density, with variable-length segments converted into "gist tokens" at leaf nodes. A lightw...	03-14 10:15	Success	-	View
exp_pytrain.20260314101323.018_20260314_101402 Paper: pytrain.20260314101323.018	Strictly-Typed Event Dispatcher with Protocol Constraints This benchmark tests your ability to design a robust, type-safe event system using Python's advanced type hinting features (`Protocol`, `Generic`, `TypeVar`). The goal is to create a generic `EventBus` that enforces structural subtyping (du...	03-14 10:14	Success	-	View
exp_2512.13956v2_20260314_101048 Paper: 2512.13956v2	Benchmark: AOI vs. Standard LLM Agent Architecture: AOI proposes a multi-agent framework integrating three specialized agents with an LLM-based Context Compressor. It features a three-layer memory hierarchy (Working, Episodic, Semantic) and a dynamic task scheduler for...	03-14 10:10	Success	-	View
exp_2505.18458v3_20260314_100946 Paper: 2505.18458v3	LLM x DATA: KV-Cache Management Benchmark Paper: A Survey of LLM $\times$ DATA Architecture & Feasibility: This is a broad survey (DATA4LLM) proposing a paradigm where inference is treated as a data-serving problem. It does not introduce a specific model architecture but re...	03-14 10:09	Success	-	View
exp_2406.19251v1_20260314_100753 Paper: 2406.19251v1	AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation Architecture: AutoRAG-HP implements a two-level Hierarchical Multi-Armed Bandit (Hier-MAB) to automate RAG hyperparameter tuning online. RAG Specifics: Optimizes dense retrieval pipelines by dynamically adjusting top-k document co...	03-14 10:07	Success	-	View
exp_pytrain.20260314100532.017_20260314_100603 Paper: pytrain.20260314100532.017	Type-Safe Dynamic Plugin Registry This coding drill demonstrates how to architect a modular, type-safe application by programmatically generating Python packages and enforcing runtime interface contracts. Overview The `benchmark.py` script performs the following complex ope...	03-14 10:06	Success	-	View
exp_2510.12856v1_20260314_100322 Paper: 2510.12856v1	Efficient Adaptive Transformer (EAT) Benchmark Architecture: EAT integrates progressive token pruning, sparse attention, and dynamic early exiting into a unified 6-layer encoder (DistilBERT-based) designed for input-adaptive computation. Memory Footprint: While token pruning and...	03-14 10:03	Success	-	View
exp_2510.17197v1_20260314_100206 Paper: 2510.17197v1	ZSPAPrune: Zero-Shot Prompt-Aware Token Pruning for Vision-Language Models Architecture ZSPAPrune introduces a zero-shot, hierarchical token pruning strategy for Vision-Language Models (VLMs). It operates in two stages: 1. Prompt-Guided Selection: Identifies visual tokens with high attentional relevance to...	03-14 10:02	Success	-	View
exp_2510.18234v1_20260314_100057 Paper: 2510.18234v1	DeepSeek-OCR: Optical Compression Benchmark Architecture: Hybrid system utilizing `DeepEncoder` (compression engine) and a `DeepSeek3B-MoE-A570M` decoder. It maps dense text and high-resolution images into "optical 2D maps" represented as sparse vision tokens. **Memory Footprint:...	03-14 10:01	Success	-	View
exp_pytrain.20260314095803.016_20260314_095838 Paper: pytrain.20260314095803.016	Structural Subtyping and Dynamic Module Discovery This benchmark tests the implementation of a flexible plugin architecture using Python's `typing.Protocol` for structural subtyping and runtime discovery mechanisms. Objective Create a single-file Python script that: 1. **Defines a Protocol...	03-14 09:58	Success	-	View
exp_2511.02650v2_20260314_095618 Paper: 2511.02650v2	Can Visual Input Be Compressed? A Visual Token Compression Benchmark for Large Multimodal Models Architecture: Introduces UniPruneBench, a standardized benchmark for evaluating visual token pruning (and merging) strategies in LMMs (LLaVA, InternVL, Qwen2.5-VL). Memory Footprint: Focuses on reducing the massive token seq...	03-14 09:56	Success	-	View
exp_2511.11139v2_20260314_095508 Paper: 2511.11139v2	Speech-Aware Long Context Pruning and Integration for Contextualized Automatic Speech Recognition Architecture: The paper proposes SAP$^{2}$, a dual-stage framework utilizing Speech-Driven Attention-based Pooling (SDAP). This module dynamically compresses long textual context (e.g., presentation slides) into dense embeddings...	03-14 09:55	Success	-	View
exp_2505.20698v1_20260314_095359 Paper: 2505.20698v1	Sparsified State-Space Models (Simba) Benchmark Architecture: Simba proposes a sparsified Mamba (SSM) architecture using hierarchical token pruning. It retains dense processing in lower layers to capture local features while aggressively pruning tokens in upper layers to establish "h...	03-14 09:54	Success	-	View
exp_pytrain.20260314095037.015_20260314_095100 Paper: pytrain.20260314095037.015	Strictly Typed Generic Result Container Module Benchmark This benchmark tests the creation and usage of a strictly typed `Result[T, E]` monad container. It enforces proper encapsulation using `__all__`, utilizes `typing.Generic` and `dataclasses`, and validates the contract safety provided by PEP...	03-14 09:51	Success	-	View
exp_2506.11886v1_20260314_094858 Paper: 2506.11886v1	Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache Architecture FourierAttention is a training-free framework optimizing the KV cache by exploiting the heterogeneous roles of attention heads. It maintains local context in lower dimensions while compressing long-range dependencies in upp...	03-14 09:49	Success	-	View
exp_2506.13166v1_20260314_094730 Paper: 2506.13166v1	GreedyPrune: Retenting Critical Visual Token Set for Large Vision Language Models Architecture GreedyPrune is a training-free, plug-and-play visual token pruning module. It formalizes token selection as a combinatorial optimization problem, utilizing a greedy algorithm to jointly maximize semantic saliency (importanc...	03-14 09:47	Success	-	View
exp_2407.12077v1_20260314_094618 Paper: 2407.12077v1	GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression Architecture: GoldFinch is a hybrid stacking an enhanced RWKV-6 ("Finch") base with a novel "GOLD" Transformer top. It combines RNN recurrence with linear attention mechanisms to balance efficient state management with high-performance...	03-14 09:46	Success	-	View
exp_pytrain.20260314094329.014_20260314_094405 Paper: pytrain.20260314094329.014	Python Skill Fallback Title: Strictly Typed Package Scaffolder - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 09:44	Success	-	View
exp_2403.08312v3_20260314_094055 Paper: 2403.08312v3	StreamingDialogue: Prolonged Dialogue Learning via Long Context Compression with Minimal Losses Architecture: StreamingDialogue compresses long dialogue histories into "conversational attention sinks" located at End-of-Utterance (EoU) tokens. It replaces dense full-context attention with a compressed representation, utilizing Shor...	03-14 09:40	Success	-	View
exp_2510.07293v1_20260314_093915 Paper: 2510.07293v1	AudioMarathon: A Comprehensive Benchmark for Long-Context Audio Understanding and Efficiency in Audio LLMs AudioMarathon Benchmark Analysis * Architecture & Scope: This is a benchmark paper evaluating Large Audio Language Models (LALMs) on long-form audio (90s–300s). It exposes the limitations of standard Transformer attention ($O(N^2)$)...	03-14 09:39	Success	-	View
exp_2401.03462v3_20260314_093830 Paper: 2401.03462v3	Long Context Compression with Activation Beacon Architecture: Introduces a "plug-in" module that directly compresses Keys and Values (KV) activations at every transformer layer. Unlike soft prompt methods, it uses a progressive, fine-grained workflow where compression is trained via...	03-14 09:38	Success	-	View
exp_pytrain.20260314093515.013_20260314_093615 Paper: pytrain.20260314093515.013	Strict-Typed Kernel API Design Benchmark Objective This benchmark validates the implementation of a robust, strictly-typed kernel API design using Python's type hinting system (`typing.Protocol`, `typing.Generic`, `typing.TypeVar`) and module encapsulation (`__all__`). Design Brie...	03-14 09:36	Success	-	View
exp_2510.18269v1_20260314_093331 Paper: 2510.18269v1	StreamingTOM: Streaming Token Compression for Efficient Video Understanding Architecture: StreamingTOM is a training-free, two-stage framework for streaming video LLMs. It decouples efficiency into: 1. Causal Temporal Reduction (Pre-LLM): Enforces a fixed visual budget per frame by selecting tokens based on...	03-14 09:33	Success	-	View
exp_2510.22101v1_20260314_093212 Paper: 2510.22101v1	Efficient SLM Semantic Search Benchmark Summary for ARES 8GB Roadmap * Architecture: Decoder-only SLM tailored for semantic search. * Memory Footprint: Structural pruning reduces model size by 40%, while context compression techniques decrease input sequence length by...	03-14 09:32	Success	-	View
exp_2504.04514v2_20260314_093123 Paper: 2504.04514v2	Saliency-driven Dynamic Token Pruning for Large Language Models Architecture: SDTP integrates a lightweight saliency-driven prediction module into LLM layers to estimate token importance via hidden states. It employs hierarchical pruning to dynamically discard redundant tokens layer-by-layer. **Memo...	03-14 09:31	Success	-	View
exp_pytrain.20260314092842.012_20260314_092911 Paper: pytrain.20260314092842.012	Generic Plugin Registry with Dynamic Module Loading This benchmark evaluates an implementation of a robust, type-safe plugin architecture using Python's `typing` module and standard library introspection tools. Overview The script implements a `PluginRegistry` generic class capable of storin...	03-14 09:29	Success	-	View
exp_2506.02850v2_20260314_092706 Paper: 2506.02850v2	METok: Multi-Stage Event-based Token Compression Benchmark Architecture METok is a training-free, three-stage token compression pipeline for Video LLMs: 1. Event-aware Compression: Reduces redundancy during vision encoding. 2. Hierarchical Pruning: Filters tokens during the prefill stag...	03-14 09:27	Success	-	View
exp_2506.05167v2_20260314_092611 Paper: 2506.05167v2	ECoRAG: Evidentiality-guided Compression Benchmark Architecture: ECoRAG proposes an iterative retrieval framework. It utilizes an evidentiality-guided compression module that functions as a semantic filter/reranker, processing retrieved chunks to retain only information strictly...	03-14 09:26	Success	-	View
exp_2506.11092v2_20260314_092516 Paper: 2506.11092v2	Dynamic Context Tuning for Retrieval-Augmented Generation: Enhancing Multi-Turn Planning and Tool Adaptation Architecture: DCT is a lightweight RAG wrapper featuring an attention-based context cache and a LoRA-based retrieval router to handle dynamic tools and multi-turn history. Retrieval & Context: * Retrieval Architecture: Uses LoRA...	03-14 09:25	Success	-	View
exp_2407.09252v3_20260314_092410 Paper: 2407.09252v3	Context Embeddings for Efficient Answer Generation in RAG Architecture: COCOM proposes a compression module that encodes retrieved documents into a fixed set of Context Embeddings, bypassing the processing of long text sequences during decoding. RAG Specifics: * Retrieval Strategy: Ope...	03-14 09:24	Success	-	View
exp_pytrain.20260314092059.011_20260314_092203 Paper: pytrain.20260314092059.011	AST-Based Type Compliance Checker Benchmark This benchmark defines a task for an autonomous coding agent to create a static analysis tool named `pkg_typing_guard.py`. The tool must recursively scan a given directory, identify valid Python packages (directories containing `__init__.py...	03-14 09:22	Success	-	View
exp_2408.05933v1_20260314_092024 Paper: 2408.05933v1	Optimizing RAG Techniques for Automotive Industry PDF Chatbots: A Case Study with Locally Deployed Ollama Models Architecture & Feasibility: This paper proposes a Self-RAG agent architecture using LangGraph and Ollama, designed for local, low-resource environments. It is highly feasible for 8GB VRAM roadmaps, leveraging Ollama’s qu...	03-14 09:20	Success	-	View
exp_2409.10593v3_20260314_091855 Paper: 2409.10593v3	CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios Architecture: CSKV targets KV cache redundancy via channel-level low-rank decomposition on Key/Value projection layers. It utilizes a hybrid "bi-branch" cache: a sliding window preserves full-precision local context, while the global hi...	03-14 09:18	Success	-	View
exp_2512.00504v1_20260314_091749 Paper: 2512.00504v1	G-KV: Decoding-Time KV Cache Eviction with Global Attention Architecture: G-KV introduces a decoding-time KV eviction mechanism utilizing a global scoring function. It combines local attention patterns with historical importance metrics to accurately identify and prune redundant tokens. To count...	03-14 09:17	Success	-	View
exp_2512.11920v1_20260314_091636 Paper: 2512.11920v1	CXL-SpecKV: A Disaggregated FPGA Speculative KV-Cache for Datacenter LLM Serving CXL-SpecKV targets the memory bandwidth bottleneck of LLM serving by disaggregating Key-Value (KV) caches from GPU VRAM. * Architecture: Uses Compute Express Link (CXL) to offload KV storage to remote FPGA memory, decoupling memory...	03-14 09:16	Success	-	View
exp_pytrain.20260314091415.010_20260314_091438 Paper: pytrain.20260314091415.010	Python Skill Fallback Title: AsyncIO Data Pipeline with Strict Typing and Module Structure - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 09:14	Success	-	View
exp_2503.23367v3_20260314_091258 Paper: 2503.23367v3	FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning Architecture: FastVAR is a post-training acceleration framework for Visual Autoregressive (VAR) models. It introduces a "cached token pruning" strategy that identifies converged tokens during the final (large-scale) generation step. Ins...	03-14 09:13	Success	-	View
exp_2504.00557v1_20260314_091155 Paper: 2504.00557v1	README: Efficient LLaMA-3.2-Vision Benchmark Architecture Targets cross-attention-based LVLMs (specifically LLaMA-3.2-Vision). Unlike prior methods focused on self-attention, this approach exploits sparsity in cross-attention maps to identify and prune redundant visual features di...	03-14 09:11	Success	-	View
exp_2505.15394v1_20260314_091057 Paper: 2505.15394v1	Reranking with Compressed Document Representation Architecture & RAG: Proposes a pipeline utilizing a first-stage retriever, a document compressor, and a distilled 1B-parameter reranker. Instead of processing raw text, the reranker consumes fixed-size embedding representations of docum...	03-14 09:11	Success	-	View
exp_pytrain.20260314090712.009_20260314_090812 Paper: pytrain.20260314090712.009	Robust Distribution Metadata Inspector A Python CLI tool and coding drill benchmark designed to introspect environment packaging metadata using the standard library. This tool enforces strict type safety and gracefully handles missing or corrupt package data. Features * **Zero D...	03-14 09:08	Success	-	View
exp_2407.01527v2_20260314_085525 Paper: 2407.01527v2	KV Cache Compression Benchmark Paper: KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches Summary for ARES 8GB Roadmap: This study provides a critical benchmark for long-context inference strategies,...	03-14 09:05	Success	-	View
exp_2403.12968v2_20260314_085423 Paper: 2403.12968v2	This benchmark evaluates the efficiency of the LLMLingua-2 methodology, which employs a small Transformer encoder (simul... Architecture: Replaces unidirectional entropy-based models (LLaMA-7B) with a bidirectional Transformer Encoder (e.g., XLM-RoBERTa-large). Formulates compression as a token classification problem, using data distillation to train...	03-14 08:54	Success	-	View
exp_2307.06945v4_20260314_085314 Paper: 2307.06945v4	ICAE Efficiency Benchmark Architecture: Introduces the In-context Autoencoder (ICAE), a lightweight wrapper (~1% parameter overhead) for Llama models. It utilizes a two-stage training pipeline (autoencoding + instruction tuning) to compress long contexts into de...	03-14 08:53	Success	-	View
exp_pytrain.20260314084951.008_20260314_085030 Paper: pytrain.20260314084951.008	Python Skill Fallback Title: Metadata-Aware Typed Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 08:50	Success	-	View
exp_2308.14508v2_20260314_083802 Paper: 2308.14508v2	LongBench: Long Context Understanding Benchmark Paper Type: Benchmark / Evaluation Study. Relevance to ARES 8GB: LongBench standardizes evaluation for long-context understanding across 21 datasets (avg. length 6,711 words). While it proposes no new architecture, it offers critica...	03-14 08:48	Success	-	View
exp_2510.13799v1_20260314_083655 Paper: 2510.13799v1	BRIEF-Pro: Universal Context Compression with Short-to-Long Synthesis for Fast and Accurate Multi-Hop Reasoning Architecture: BRIEF-Pro is a lightweight, universal compressor model utilizing "short-to-long synthesis" to perform abstractive summarization of retrieved documents, specifically trained to handle contexts exceeding 10k words. **RAG Imp...	03-14 08:36	Success	-	View
exp_2510.20535v1_20260314_083608 Paper: 2510.20535v1	Benchmark: ARC-Encoder Efficiency Simulation Architecture: ARC-Encoder is a standalone compression model mapping $N$ text tokens to $N/x$ continuous vectors ($x \in \{4, 8\}$). These vectors replace standard token embeddings at the input layer of a frozen decoder LLM. **Memory Foo...	03-14 08:36	Success	-	View
exp_2512.12701v1_20260314_083506 Paper: 2512.12701v1	Efficient Vision-Language Reasoning via Adaptive Token Pruning Architecture ATP introduces a lightweight gating module at the vision-language interface. It dynamically prunes visual tokens by ranking them via a hybrid importance score (combining ViT intra-modal attention and CLIP text-image similar...	03-14 08:35	Success	-	View
exp_pytrain.20260314083214.007_20260314_083251 Paper: pytrain.20260314083214.007	Dynamic Plugin Registry Benchmark This benchmark evaluates an autonomous agent's ability to construct a robust, extensible plugin system using the Python standard library. It specifically targets the combination of `typing.Protocol` for Structural Subtyping (Duck Typing wit...	03-14 08:32	Success	-	View
exp_2505.23277v2_20260314_083034 Paper: 2505.23277v2	Sentinel: Decoding Context Utilization via Attention Probing for Efficient LLM Context Compression Sentinel optimizes RAG inference by treating context compression as an attention-decoding task. * Architecture: Uses a lightweight 0.5B proxy model with a trained "readout" module to probe the frozen target LLM's attention p...	03-14 08:30	Success	-	View
exp_2407.08454v2_20260314_082827 Paper: 2407.08454v2	Benchmark for Adaptive KV Cache Merging (KVMerger) Paper: Model Tells You Where to Merge (KVMerger) * Architecture: KVMerger optimizes the Transformer attention mechanism by compressing the KV cache. It utilizes a Merging Set Identification algorithm to group tokens based on i...	03-14 08:28	Success	-	View
exp_2409.01579v1_20260314_082729 Paper: 2409.01579v1	AdaComp: Adaptive Context Compression Benchmark Architecture: AdaComp augments standard Dense Retrieval pipelines (Retriever $\to$ LLM) with a lightweight rate predictor. This small auxiliary model (typically a distilled BERT or MLP) performs extractive compression, filtering the...	03-14 08:27	Success	-	View
exp_pytrain.20260314082428.006_20260314_082520 Paper: pytrain.20260314082428.006	Python Skill Fallback Title: Generic Pipeline Engine with Dynamic Virtual Packaging - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 08:25	Success	-	View
exp_2510.16439v4_20260314_081227 Paper: 2510.16439v4	FrugalPrompt Benchmark Architecture: FrugalPrompt is a prompt compression framework using token attribution methods (specifically GlobEnc and DecompX). It operates as a preprocessing layer that scores input tokens for semantic salience and retains only the to...	03-14 08:22	Success	-	View
exp_pytrain.20260314080948.005_20260314_081032 Paper: pytrain.20260314080948.005	StrictTypeRegistry: Protocol-Based Plugin System Overview This benchmark evaluates the implementation of a robust, structural subtyping-based plugin manager using Python's standard `typing.Protocol`. The goal is to enforce strict interface adherence without relying on external meta-pr...	03-14 08:10	Success	-	View
exp_2511.13223v1_20260314_080802 Paper: 2511.13223v1	This benchmark is designed to simulate the inference stage of a Reasoning LLM. It compares the computational cost (VRAM... TokenSqueeze optimizes reasoning LLMs (e.g., DeepSeek-R1) by training them to generate concise Chain-of-Thought (CoT) traces, addressing the high memory and latency costs of long reasoning sequences. * Architecture: A two-stage trai...	03-14 08:08	Success	-	View
exp_2511.17885v1_20260314_080701 Paper: 2511.17885v1	FastMMoE: Accelerating Multimodal LLMs Benchmark Architecture: FastMMoE is a training-free accelerator for MoE-based Multimodal LLMs (e.g., DeepSeek-VL2). It optimizes inference through Routing-Aware Token Pruning, which clusters and removes visual tokens sharing high routing prob...	03-14 08:07	Success	-	View
exp_2409.00855v1_20260314_080552 Paper: 2409.00855v1	LanguaShrink: Reducing Token Overhead with Psycholinguistics Architecture: LanguaShrink proposes a task-agnostic compression framework utilizing psycholinguistic principles (the Ebbinghaus memory curve) and Part-of-Speech (POS) tagging to score token importance. It employs a chunk-based algorithm...	03-14 08:05	Success	-	View
exp_pytrain.20260314080234.004_20260314_080310 Paper: pytrain.20260314080234.004	Strictly-Typed Pipeline with Namespace Hygiene This benchmark evaluates a candidate's ability to construct a robust, modular data processing pipeline using advanced Python type hinting features and strict namespace controls. Objectives 1. Type Safety: Define strict `Protocol` interf...	03-14 08:03	Success	-	View
exp_2510.10448v1_20260314_080101 Paper: 2510.10448v1	RECON: Reasoning with Condensation for Efficient Retrieval-Augmented Generation Architecture & Retrieval: RECON modifies the standard RAG pipeline by inserting a learned condenser module between retrieval and generation. Utilizing the Search-R1 framework, it employs a distillation-trained summarizer to compre...	03-14 08:01	Success	-	View
exp_2511.06029v3_20260314_075936 Paper: 2511.06029v3	This benchmark evaluates the Lethe framework, focusing on its Layer- and Time-Adaptive KV Cache Pruning for LLMs. It... Architecture: Lethe introduces a dynamic KV cache management framework with two distinct dimensions of adaptivity: 1. Spatial (Layer-wise): Allocates token pruning budgets individually per layer based on estimated attention redundan...	03-14 08:00	Success	-	View
exp_2511.12869v2_20260314_075848 Paper: 2511.12869v2	On the Fundamental Limits of LLMs at Scale Architecture & Memory: This paper provides a theoretical proof that LLM scaling is fundamentally bounded by computability and information theory. It characterizes "context compression" as a geometric limit, proving that effective contex...	03-14 07:58	Success	-	View
exp_pytrain.20260314075558.003_20260314_075635 Paper: pytrain.20260314075558.003	Generic Plugin Loader with Runtime Type Enforcement This benchmark demonstrates a robust, modular architecture for discovering and loading Python plugins dynamically at runtime. It leverages `importlib` for filesystem-based discovery and `typing.Protocol` for structural subtyping (duck typin...	03-14 07:56	Success	-	View
exp_2511.18936v1_20260314_075430 Paper: 2511.18936v1	SWAN: Sparse Winnowed Attention for Reduced Inference Memory via Decompression-Free KV-Cache Compression Architecture: SWAN introduces a fine-tuning-free framework utilizing an offline orthogonal matrix to rotate and prune the KV-cache. It augments this sparse data with a small, fixed-size dense buffer to maintain retrieval accuracy. **Mem...	03-14 07:54	Success	-	View
exp_2505.08261v1_20260314_075317 Paper: 2505.08261v1	Enhancing Cache-Augmented Generation (CAG) with Adaptive Contextual Compression Architecture: Proposes a Hybrid CAG-RAG Framework utilizing Adaptive Contextual Compression (ACC). The system preloads static knowledge into the context window (CAG) but activates selective retrieval for dynamic or missing i...	03-14 07:53	Success	-	View
exp_2505.18092v2_20260314_075232 Paper: 2505.18092v2	QwenLong-CPRS Benchmark Suite Architecture: QwenLong-CPRS is a compression framework featuring Bidirectional Reasoning Layers and Token Critics (using LM heads) to perform dynamic, natural language-guided context pruning. It utilizes **Window-Parallel Infere...	03-14 07:52	Success	-	View
exp_2407.21118v2_20260314_075147 Paper: 2407.21118v2	Palu: Compressing KV-Cache with Low-Rank Projection Architecture: Palu targets hidden-dimension redundancy by decomposing projection matrices into low-rank components. It caches compressed Key/Value states and reconstructs full tensors on-the-fly during attention. The framework utilizes...	03-14 07:51	Success	-	View
exp_pytrain.20260314074858.002_20260314_074929 Paper: pytrain.20260314074858.002	Python Skill Fallback Title: Modern Generic Data Structures with PEP 695 - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 07:49	Success	-	View
exp_2402.18096v1_20260314_074714 Paper: 2402.18096v1	Benchmark: Mixed-Precision KV Cache (MiKV) Simulation Architecture: MiKV proposes an importance-aware mixed-precision quantization scheme. Instead of discarding "unimportant" tokens, the architecture retains the full KV context but stores high-importance pairs in high precision (e.g., FP16...	03-14 07:47	Success	-	View
exp_2505.23416v2_20260314_074555 Paper: 2505.23416v2	KVzip Benchmark Suite Architecture: KVzip is a query-agnostic eviction method that compresses KV caches based on a reconstruction proxy. It quantifies token importance by using the underlying LLM to reconstruct the original context from the KV cache; tok...	03-14 07:46	Success	-	View
exp_2408.15491v1_20260314_074448 Paper: 2408.15491v1	Instruction-Aware Contextual Compression Benchmark Architecture: Introduces Instruction-Aware Contextual Compression, a lightweight filter module designed to sit between the retriever and the LLM. It uses the instruction prompt to identify and prune irrelevant segments from retrieve...	03-14 07:45	Success	-	View
exp_cr_10.1145_3759441.3759448_20260314_074406 Paper: cr_10.1145_3759441.3759448	EMPIRIC: Exploring Missing Pieces in KV Cache Compression for Reducing Computation, Storage, and Latency in Long-Context... Architecture: An oracle-based framework extending RocketKV, analyzing intrinsic attention head patterns to define theoretical bounds for optimal KV cache eviction. Memory Footprint: Significantly reduces VRAM usage by validating agg...	03-14 07:44	Success	-	View
exp_pytrain.20260314074150.001_20260314_074239 Paper: pytrain.20260314074150.001	Python Skill Fallback Title: Strictly Typed Configuration Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 07:42	Success	-	View
exp_cr_10.1145_3759441.3759448_20260314_073316 Paper: cr_10.1145_3759441.3759448	Benchmark: EMPIRIC KV Cache Compression Architecture: An oracle-based framework extending RocketKV, analyzing intrinsic attention head patterns to define theoretical bounds for optimal KV cache eviction. Memory Footprint: Significantly reduces VRAM usage by validating agg...	03-14 07:33	Pending	-	View
exp_2506.08373v3_20260314_073210 Paper: 2506.08373v3	Draft-based Approximate Inference for LLMs Architecture: Introduces a draft-based framework using a small auxiliary model (e.g., 1-3B) to perform lookahead importance estimation for a larger target model. It proposes SpecKV (KV cache eviction), SpecPC (prompt token pruni...	03-14 07:32	Success	-	View
exp_pytrain.20260314072923.005_20260314_072958 Paper: pytrain.20260314072923.005	Typed Package Bootstrapper Overview This benchmark evaluates a Python system's ability to synthesize a standard-compliant Python project structure. It rigorously validates metadata configuration using `typing.TypedDict` schemas before generating filesystem artifacts....	03-14 07:30	Success	-	View
exp_pytrain.20260314065457.004_20260314_065525 Paper: pytrain.20260314065457.004	Strictly-Typed Dynamic Package Loader Benchmark Objective This benchmark evaluates an autonomous agent's ability to programmatically construct a valid Python package structure on the filesystem, utilize the `importlib` standard library for dynamic module loading, and enforce strict runti...	03-14 06:55	Success	-	View
exp_pytrain.20260314064115.003_20260314_064149 Paper: pytrain.20260314064115.003	Python Skill Fallback Title: Strictly-Typed CLI Dispatcher with ParamSpec - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 06:41	Success	-	View
exp_pytrain.20260314063413.002_20260314_063448 Paper: pytrain.20260314063413.002	PEP 695 Generic Package Scaffolder This coding drill benchmarks the developer experience and code robustness improvements offered by PEP 695 Type Parameter Syntax (introduced in Python 3.12). Hypothesis Adopting PEP 695 syntax (using square brackets for generics and `typ...	03-14 06:34	Success	-	View
exp_pytrain.20260314062740.001_20260314_062802 Paper: pytrain.20260314062740.001	Python Skill Fallback Title: Robust Plugin Loader with Strict Type Safety - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-14 06:28	Success	-	View
exp_2506.16636v1_20260313_105745 Paper: 2506.16636v1	This benchmark evaluates the performance of Masked Autoregressive Flows (MAF) utilizing the Latent Noise Injection (LNI)... Architecture The method relies on Masked Autoregressive Flows (MAF). Rather than standard generative sampling, it proposes a "Latent Noise Injection" (LNI) technique: encoding specific observed data points into the latent space, app...	03-13 10:57	Success	-	View
exp_pytrain.20260313105503.016_20260313_105531 Paper: pytrain.20260313105503.016	Robust Dynamic Plugin Registry with importlib Overview This drill demonstrates the construction of a modular, type-safe plugin loader using Python's standard library. It bridges the gap between dynamic runtime imports and static type checking by leveraging `typing.Protocol` for structu...	03-13 10:55	Success	-	View
exp_2506.16584v1_20260313_105421 Paper: 2506.16584v1	Benchmark: Semantic Stability on Constrained Hardware Architecture & Methodology This paper does not propose a new model architecture. Instead, it introduces a Variance Decomposition Framework, an evaluation methodology designed to measure semantic grounding. It assesses whether an LLM...	03-13 10:54	Success	-	View
exp_oa_W4412056540_20260313_105243 Paper: oa_W4412056540	Backfill Candidate oa_W4412056540 This paper analyzes the shift to data-centric AI, identifying key bottlenecks for embedded and real-time systems relevant to the ARES 8GB roadmap. Architecture & Memory: The authors argue that while training faces data scarcity, inferen...	03-13 10:52	Success	-	View
exp_hf_2603.09400_20260313_105158 Paper: hf_2603.09400	Backfill Candidate hf_2603.09400 Architecture: StateFactory utilizes an LLM to transform unstructured observations into factorized, hierarchical object-attribute structures. Instead of discriminative training, it computes rewards as semantic similarity between the...	03-13 10:52	Success	-	View
exp_2309.16859v1_20260313_105059 Paper: 2309.16859v1	Benchmark: Identity-Conditioned HyperNeRF (Backfill Candidate 2309.16859v1) Architecture: Utilizes an identity-conditioned hypernetwork to generate NeRF weights, learning a volumetric latent space of facial geometry and appearance from a low-res multi-view dataset. Memory Footprint: High Risk. While the...	03-13 10:51	Success	-	View
exp_cr_10.1515_jiip-2022-0050_20260313_105015 Paper: cr_10.1515_jiip-2022-0050	Multi-Fidelity Bayesian Inference Benchmark Architecture Proposes a multi-fidelity framework combining a low-fidelity Deep Neural Network (DNN) surrogate with a high-fidelity physical model for Bayesian inference on elastic properties. The DNN handles the bulk of the prior distri...	03-13 10:50	Success	-	View
exp_pytrain.20260313104750.015_20260313_104834 Paper: pytrain.20260313104750.015	Python Skill Fallback Title: PEP 695 Generic API with Public Interface Control - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-13 10:48	Success	-	View
exp_2403.18096v1_20260313_104620 Paper: 2403.18096v1	Benchmark: Cascade Temporal Filtering (Backfill Candidate 2403.18096v1) Summary for ARES 8GB Roadmap Architecture: The paper proposes a "cascade temporal filtering" method using dual-time dimensions (isochronal and chronological) to distinguish short- and long-term human activity. Crucially, it function...	03-13 10:46	Success	-	View
exp_2409.14586v1_20260313_104534 Paper: 2409.14586v1	Backfill Candidate 2409.14586v1 Architecture: Introduces a single `[RESET]` token to the vocabulary. Training (SFT/DPO) conditions the model to emit this token to abort unsafe continuations and restart generation, effectively adding a "self-correct" loop without struc...	03-13 10:45	Success	-	View
exp_2409.14538v1_20260313_104439 Paper: 2409.14538v1	Benchmark: HMDC (Heterogeneous Multi-model Dataset Condensation) Architecture: HMDC proposes a framework for generating model-agnostic condensed datasets by utilizing multiple heterogeneous architectures simultaneously. To resolve conflicts between diverse models, it introduces a Gradient Balance Mod...	03-13 10:44	Success	-	View
exp_oa_W4403322739_20260313_104306 Paper: oa_W4403322739	This benchmark evaluates the inference performance (memory footprint and generation speed) of a standard Transformer-bas... This survey evaluates generative LLM architectures (specifically GPT and Llama series) and their inference performance across diverse hardware platforms (CPU, GPU, FPGA, ASIC, PIM). * Architecture: Focuses on standard Transformer-based...	03-13 10:43	Success	-	View
exp_pytrain.20260313104107.014_20260313_104135 Paper: pytrain.20260313104107.014	Self-Validating Plugin Registry with Strict Typing Overview This benchmark demonstrates the implementation of a type-safe, modular plugin architecture using Python's standard library. It leverages `typing.Protocol` for structural subtyping and `importlib` for dynamic runtime introspection a...	03-13 10:41	Success	-	View
exp_cr_10.58414_scientifictemper.2025.16.2.03_20260313_103952 Paper: cr_10.58414_scientifictemper.2025.16.2.03	MRMGKTL Benchmark Analysis for ARES 8GB Roadmap * Architecture: The MRMGKTL model combines a standard Transformer encoder with a Gaussian Kernel classifier. Crucially, it utilizes a pre-processing pipeline involving Sokal–Michener’s multivariate reli...	03-13 10:39	Success	-	View
exp_2506.16594v2_20260313_103754 Paper: 2506.16594v2	Benchmark: Efficient Local Biomedical Inference This paper is a scoping review, not a technical architecture proposal. Consequently, it provides no specific data regarding model architecture, memory footprint, or inference speed required for the ARES 8GB roadmap. * **Architecture...	03-13 10:39	Success	-	View
exp_2506.16575v1_20260313_103712 Paper: 2506.16575v1	Benchmark for Elo-Based Harmful Content Detection Workflow Paper Summary: Elo Rating System for Harmful Content Detection Architecture: The paper proposes an inference workflow utilizing an Elo rating system to rank and select optimal LLM responses for detecting harmful content (microaggres...	03-13 10:37	Success	-	View
exp_pytrain.20260313103445.013_20260313_103506 Paper: pytrain.20260313103445.013	Strictly Typed Backend Registry with Runtime Validation This benchmark demonstrates a robust, pluggable architecture simulation using Python's `typing.Protocol` for structural subtyping. It implements a `KernelRegistry` that enforces strict type checking at registration time, ensuring that only...	03-13 10:35	Success	-	View
exp_2506.16571v2_20260313_102932 Paper: 2506.16571v2	Benchmark: Visualization Rationale Extraction Paper Analysis: Capturing Visualization Design Rationale This paper introduces a methodology and dataset for extracting visualization design rationales from student notebooks, creating a corpus of Question-Answer-Rationale triples usi...	03-13 10:33	Success	-	View
exp_pytrain.20260313102718.012_20260313_102745 Paper: pytrain.20260313102718.012	Dynamic Type-Safe Component Loader Overview This benchmark implements a robust, self-contained plugin architecture using Python's standard library. It demonstrates advanced use of `importlib` for dynamic module loading from arbitrary file paths and `typing.Protocol` for stru...	03-13 10:27	Success	-	View
exp_cr_10.1038_s41698-025-01103-4_20260313_102301 Paper: cr_10.1038_s41698-025-01103-4	LLM-AIx Pipeline Benchmark: Local Privacy-Preserving Extraction Summary: LLM-AIx Pipeline for Oncology * Architecture: The paper outlines LLM-AIx, a software protocol acting as a wrapper for open-source, privacy-preserving LLMs. It is designed to extract structured clinical data (e.g., TNM s...	03-13 10:25	Success	-	View
exp_2512.14954v1_20260313_102220 Paper: 2512.14954v1	Backfill Candidate 2512.14954v1 Summary for ARES 8GB Roadmap Architecture: Proposes a probabilistic framework to align teacher and student probability spaces across distinct tokenizers. By exploiting the recursive structure of Byte-Pair Encoding (BPE), it enables...	03-13 10:22	Success	-	View
exp_hf_2603.09221_20260313_102122 Paper: hf_2603.09221	Test-Time Control (TTC) Layer Benchmark Architecture The paper introduces the Test-Time Control (TTC) layer, an adapter that integrates finite-horizon LQR planning into pretrained LLMs. Instead of relying solely on associative recall, the architecture projects future late...	03-13 10:21	Success	-	View
exp_hf_2603.08942_20260313_102018 Paper: hf_2603.08942	Benchmark: BiCLIP (Geometric Domain Alignment) Architecture BiCLIP functions as a lightweight wrapper for frozen Vision-Language Models (VLMs). It operates on the principle of "domain canonicalization," learning a structured geometric transformation matrix to align image-text featur...	03-13 10:20	Success	-	View
exp_pytrain.20260313101806.011_20260313_101833 Paper: pytrain.20260313101806.011	Dynamic Protocol Validator & Package Generator This benchmark validates a candidate's ability to bridge static type definitions with dynamic code execution. It simulates a plugin system where Python code is generated on-the-fly, written to the filesystem, and loaded dynamically using `i...	03-13 10:18	Success	-	View
exp_2303.10944v3_20260313_101631 Paper: 2303.10944v3	Benchmark: Pix2SG Architecture Evaluation Architecture: Pix2SG utilizes a standard Transformer Encoder-Decoder architecture. It treats Scene Graph Generation (SGG) as an autoregressive sequence-to-sequence task, converting image patches directly into a sequence of (subject,...	03-13 10:16	Success	-	View
exp_2309.16175v1_20260313_101535 Paper: 2309.16175v1	Backfill Candidate 2309.16175v1 Summary for ARES 8GB Roadmap: This paper details a data-centric training pipeline for biomedical QA (COVID-19), focusing on weak supervision and augmentation rather than inference architecture optimization. * Architecture: Stand...	03-13 10:15	Success	-	View
exp_cr_10.60027_ijsasr.2025.7518_20260313_101450 Paper: cr_10.60027_ijsasr.2025.7518	Benchmark: Blended Learning Curriculum Simulation Assessment: Irrelevant to Inference Roadmap This document is an educational pedagogical study, not a technical AI paper. It evaluates the efficacy of a blended learning curriculum for library science students at Zhoukou Normal Unive...	03-13 10:14	Success	-	View
exp_2506.16593v1_20260313_101407 Paper: 2506.16593v1	ARES 8GB Roadmap: Physical System Identification Benchmark Summary for ARES 8GB Roadmap Focus: Physical System Identification & Uncertainty Quantification (Classical/Model-based, not Deep Learning). * Architecture: Proposes a lightweight mathematical "transfer function" linking velocity...	03-13 10:14	Success	-	View
exp_pytrain.20260313101122.010_20260313_101156 Paper: pytrain.20260313101122.010	Typed Asynchronous Plugin Architecture Overview This benchmark demonstrates a robust, extensible plugin system using Structural Subtyping (Protocol) and Asynchronous I/O (asyncio). Features * Protocol Enforcement: Uses `typing.Protocol` to define the `Plugin` interfa...	03-13 10:12	Success	-	View
exp_2304.00320v1_20260313_095955 Paper: 2304.00320v1	Benchmark: Backfill Candidate 2304.00320v1 (SGD as SDE) Architecture: Theoretical analysis of training dynamics, not a network design. Proposes modeling SGD as a Stochastic Differential Equation (SDE) with two diffusion terms (mini-batch sampling and unbiased label noise). **Memory Footprint...	03-13 10:09	Success	-	View
exp_2309.16849v2_20260313_095842 Paper: 2309.16849v2	Benchmark: Shifted Non-Local Search (SNLS) vs. Standard Attention Architecture: Proposes Shifted Non-Local Search (SNLS), a hybrid space-time attention mechanism. It predicts global offsets for long-range motion and refines them via a corrective local grid search. This acts as a drop-in replacemen...	03-13 09:58	Success	-	View
exp_pytrain.20260313095549.009_20260313_095627 Paper: pytrain.20260313095549.009	Type-Safe Dynamic Extension Loader This benchmark validates the hypothesis that Python's `typing.Protocol` combined with `importlib` can be used to create a robust, zero-dependency plugin architecture. Objective To design a runtime system that: 1. Defines a strict structural...	03-13 09:56	Success	-	View
exp_2403.18148v1_20260313_094810 Paper: 2403.18148v1	Benchmark Design: Feasibility of Local Empathic Models Paper Type: Behavioral Evaluation (Not an architectural proposal). Summary: This study compares empathic response generation in existing LLMs (GPT-4 Turbo, Llama 2, Mistral) against human benchmarks. It does not introduce new archit...	03-13 09:53	Success	-	View
exp_2403.18125v1_20260313_094724 Paper: 2403.18125v1	Benchmark for Digital Newcomer Queries Relevance: Low (Data Resource). Assessment: This paper proposes a dataset of "digital newcomer" queries to study LLM robustness against non-standard language. It does not present a model architecture or optimization technique. *...	03-13 09:47	Success	-	View
exp_cr_10.3390_s24072091_20260313_094645 Paper: cr_10.3390_s24072091	Benchmark: Lightweight BNN for Structural Health Monitoring (SHM) Paper Analysis: BNNs for Structural Health Monitoring (SHM) Architecture: The paper proposes a Bayesian Neural Network (BNN) utilizing probabilistic inference to predict structural displacement. It operates within a "dual-drive"...	03-13 09:46	Success	-	View
exp_pytrain.20260313094430.008_20260313_094503 Paper: pytrain.20260313094430.008	Dynamic Typed Plugin Loader with PEP 695 This benchmark verifies the hypothesis that combining dynamic module loading (`importlib`) with modern type parameter syntax (PEP 695) results in a robust, performant, and extensible plugin architecture. Hypothesis Dynamic generation and ex...	03-13 09:45	Success	-	View
exp_cr_10.36724_2072-8735-2024-18-3-41-49_20260313_094308 Paper: cr_10.36724_2072-8735-2024-18-3-41-49	Backfill Candidate cr_10.36724_2072-8735-2024-18-3-41-49 Status: Irrelevant This paper addresses telecommunications protocols (specifically queueing theory and traffic shaping for high-throughput satellites), not Deep Learning. * Architecture: N/A. The paper proposes a mathematical pr...	03-13 09:43	Success	-	View
exp_cr_10.1609_aaai.v38i16.29810_20260313_094122 Paper: cr_10.1609_aaai.v38i16.29810	Backfill Benchmark: Dynamic Layerwise Token Dropping Architecture: Framework-level intervention. Introduces "efficient data sampling" (curriculum learning) and "random layerwise token dropping" to optimize training data routing. It does not modify the underlying model architecture (e.g.,...	03-13 09:41	Success	-	View
exp_pytrain.20260313093752.007_20260313_093830 Paper: pytrain.20260313093752.007	Generic Component Pipeline Builder This benchmark evaluates the creation of a modular, type-safe data processing pipeline using Python's standard library. The goal is to design a framework that separates core logic from concrete implementations, leveraging advanced typing fe...	03-13 09:38	Success	-	View
exp_2409.14516v1_20260313_093231 Paper: 2409.14516v1	Benchmark: Local Feasibility of Phi-3-mini for Geospatial Planning Assessment: This paper evaluates GPT-4 and Phi-3-mini for geospatial and transportation planning tasks. * Architecture: The study contrasts the proprietary GPT-4 against Phi-3-mini, a lightweight transformer architecture optimized f...	03-13 09:36	Success	-	View
exp_2506.16628v1_20260313_093155 Paper: 2506.16628v1	Benchmark: Offline-LLM to Rule-Based Pipeline Architecture: Hybrid offline design. LLMs are utilized exclusively during the development phase to generate rules, identify relevant text snippets, and extract keywords. The production system is a traditional rule-based NLP pipeline (Re...	03-13 09:31	Success	-	View
exp_cr_10.3390_s25185786_20260313_093109 Paper: cr_10.3390_s25185786	Benchmark for MFT-Net (Tactile Sensing Architecture) Architecture The paper proposes MFT-Net, a hybrid architecture that integrates a Convolutional Neural Network (CNN) for local feature extraction with a Transformer module for global dependency modeling. It utilizes Squeeze-and-Excitatio...	03-13 09:31	Success	-	View
exp_pytrain.20260313092922.006_20260313_092949 Paper: pytrain.20260313092922.006	Python Skill Fallback Title: Strictly-Typed Component Registry with Dynamic Import Mechanics - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-13 09:29	Success	-	View
exp_2512.14961v3_20260313_092842 Paper: 2512.14961v3	Benchmark: Hybrid Trimodal Fusion (Backfill 2512.14961v3) Architecture: Utilizes a hybrid trimodal framework (face, voice, motion) with independent encoders feeding into a cross-attention and gated fusion module. It employs a single classification head with a confidence-weighted strategy to dy...	03-13 09:28	Success	-	View
exp_cr_10.1609_aaai.v37i4.25597_20260313_092728 Paper: cr_10.1609_aaai.v37i4.25597	Efficient Dual-Encoder CLIP with Visual Prompting Architecture & Retrieval Strategy: This paper proposes a dual-encoder architecture fine-tuning a frozen CLIP backbone. The retrieval mechanism converts the reference image into a learnable visual prompt which is prefixed to...	03-13 09:27	Success	-	View
exp_2506.12724v1_20260313_092646 Paper: 2506.12724v1	Dynamic Modality Scheduling (DMS) Benchmark Architecture: Dynamic Modality Scheduling (DMS) is a model-agnostic wrapper for Multimodal LLMs (e.g., LLaVA, BLIP-2). It uses a scheduler to weigh modality contributions based on three signals: predictive entropy (confidence), Monte Ca...	03-13 09:26	Success	-	View
exp_2304.00387v1_20260313_092545 Paper: 2304.00387v1	Benchmark for HaLP (Hallucinating Latent Positives) Architecture: Introduces a lightweight augmentation-free contrastive learning framework. The HaLP module hallucinates synthetic positive samples directly in the latent space using a closed-form solver, replacing the need for complex geo...	03-13 09:25	Success	-	View
exp_2404.00057v1_20260313_092455 Paper: 2404.00057v1	Backfill Candidate 2404.00057v1 Architecture: Proposes a cloud-centric OS architecture integrating LLMs via declarative interfaces and self-adaptive kernels. The system prioritizes personalized intelligence by decoupling the decision-making layer from local hardwa...	03-13 09:24	Success	-	View
exp_pytrain.20260313092254.005_20260313_092314 Paper: pytrain.20260313092254.005	Generic Plugin Registry with Protocol Enforcement This benchmark tests the implementation of a modular, type-safe plugin system using Python's `typing.Protocol`, `typing.TypeVar`, and `typing.Generic`. Objectives 1. Structural Subtyping: Define a strict interface using `Protocol` that...	03-13 09:23	Success	-	View
exp_cr_10.3390_en18184924_20260313_091823 Paper: cr_10.3390_en18184924	Hybrid Monte Carlo & Clustering Time-Series Forecasting Architecture: The proposed model is a hybrid statistical system combining Monte Carlo filters for state estimation with a clustering algorithm (likely K-Means or similar) for outlier removal and forecasting. It is not a neural network o...	03-13 09:21	Success	-	View
exp_cr_10.36676_jrps.v15.i3.1520_20260313_091726 Paper: cr_10.36676_jrps.v15.i3.1520	Benchmark: Content-Based Image Retrieval (CBIR) with Lightweight Feature Extraction Paper Type: Literature Survey (Not a specific implementation). * Architecture: Analyzes Deep Learning feature extractors (CNNs/ViTs) and handcrafted features. No specific architecture proposed for deployment. * **Retrieval Architect...	03-13 09:17	Success	-	View
exp_cr_10.17588_2072-2672.2023.3.062-067_20260313_091651 Paper: cr_10.17588_2072-2672.2023.3.062-067	Innovation Benchmark: Classical HVAC State-Space Control Assessment: Reject for ARES Roadmap. This paper concerns physical control theory (HVAC), not AI workloads. * Architecture: Classical (State-Space & Transfer Functions). The "model" consists of differential equations derived from the...	03-13 09:16	Success	-	View
exp_cr_10.3390_agronomy14040673_20260313_091607 Paper: cr_10.3390_agronomy14040673	Backfill Candidate cr_10.3390_agronomy14040673 Architecture: Hybrid framework combining a Densely Connected CNN for multilevel local feature extraction with a Transformer module for global context capture. A Cycle-GAN is utilized for training data augmentation but is excluded during...	03-13 09:16	Success	-	View
exp_pytrain.20260313091410.004_20260313_091433 Paper: pytrain.20260313091410.004	Python Skill Fallback Title: Strictly Typed ZipApp Packager - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-13 09:14	Success	-	View
exp_cr_10.24425_jppr.2024.151253_20260313_091249 Paper: cr_10.24425_jppr.2024.151253	Backfill Candidate cr_10.24425_jppr.2024.151253 Architecture: Modifies the YOLOv5m baseline by integrating a Swin Transformer (Swin-T) module into the backbone network. It also utilizes K-means++ for anchor optimization and Efficient IoU (EIoU) loss to improve bounding box regression...	03-13 09:12	Success	-	View
exp_2506.16597v1_20260313_091202 Paper: 2506.16597v1	Backfill Candidate 2506.16597v1 Paper: Exoplanet Classification through Vision Transformers with Temporal Image Analysis Architecture: The proposed pipeline converts 1D Kepler light curves into 2D Recurrence Plots (RPs) or Gramian Angular Fields (GAFs) to serve as...	03-13 09:12	Success	-	View
exp_cr_10.3390_rs17183200_20260313_091118 Paper: cr_10.3390_rs17183200	TransMambaCNN Architecture Benchmark Architecture TransMambaCNN utilizes a dual-branch topology to fuse global and local spatiotemporal features. The global branch replaces standard self-attention with a Convolutional State-Space Module (C-SSM), combining an Attentive...	03-13 09:11	Success	-	View
exp_2512.14908v5_20260313_091038 Paper: 2512.14908v5	Backfill Candidate 2512.14908v5 Architecture: ATLAS is a propagation-free framework replacing message passing with multi-resolution community features. It utilizes modularity-guided search to identify optimal community scales, projects these structures into embeddings...	03-13 09:10	Success	-	View
exp_2303.10699v1_20260313_090945 Paper: 2303.10699v1	Backfill Candidate 2303.10699v1 Architecture: This paper introduces a dataset augmentation strategy (FVQA 2.0) for Fact-based VQA, addressing model vulnerability to imbalanced Knowledge Graph (KG) distributions. The underlying architecture employs a Dual-Encoder s...	03-13 09:09	Success	-	View
exp_pytrain.20260313090742.003_20260313_090810 Paper: pytrain.20260313090742.003	Type-Introspective Package Manifestor Overview This benchmark validates the hypothesis that Python's standard library `typing` and `inspect` modules are sufficient to build robust, type-safe packaging utilities without external dependencies. Objective Implement a lightweight pa...	03-13 09:08	Success	-	View
exp_2506.17336v3_20260313_090606 Paper: 2506.17336v3	Backfill Candidate 2506.17336v3 Architecture: Hybrid system splitting computation between a remote strong LLM (GPT-4o) for "Socratic CoT" query planning and a local Llama-3.2-1B for final response generation. Retrieval Strategy: Uses **Homomorphically Encrypte...	03-13 09:06	Success	-	View
exp_2506.13467v1_20260313_090446 Paper: 2506.13467v1	NeuroEmbed Bi-Encoder Benchmark NeuroEmbed fine-tunes PubMedBERT for semantic retrieval of biomedical cohorts. * Architecture: Bi-encoder (PubMedBERT) fine-tuned on synthetically generated QA pairs derived from ontology-aligned metadata. * **Retrieval Strategy...	03-13 09:05	Success	-	View
exp_2304.01222v1_20260313_090354 Paper: 2304.01222v1	Benchmark: NeuroDAVIS (2304.01222v1) Architecture NeuroDAVIS employs an unsupervised deep neural network designed for dimensionality reduction. It extracts features non-linearly, theoretically preserving high-dimensional neighborhood relationships (local and global structu...	03-13 09:04	Success	-	View
exp_2304.06724v1_20260313_090305 Paper: 2304.06724v1	Backfill Candidate 2304.06724v1 Assessment: High-Risk Vulnerability for Dynamic Architectures Architecture: GradMDM targets Dynamic Neural Networks (DNNs)—models designed to skip layers or adapt width to save resources. The attack manipulates gradient directio...	03-13 09:03	Success	-	View
exp_pytrain.20260313090040.002_20260313_090104 Paper: pytrain.20260313090040.002	PEP 695 Generic Repository Benchmark This benchmark tests the implementation of Python 3.12's PEP 695 Type Parameter Syntax within a single-file module structure. Features * PEP 695 Syntax: Uses the new `class ClassName[T]:` and `type Alias[T] = ...` syntax. * **Module Enc...	03-13 09:01	Success	-	View
exp_2309.16804v2_20260313_084913 Paper: 2309.16804v2	Benchmark Candidate 2309.16804v2 Architecture: A pipeline fine-tuning an unspecified open-source model on synthetic dialogues derived from textbooks. The specific base architecture is redacted in this excerpt. Memory Footprint: No explicit VRAM usage is detailed. F...	03-13 08:59	Success	-	View
exp_cr_10.1609_aaai.v38i12.29197_20260313_084834 Paper: cr_10.1609_aaai.v38i12.29197	FLAME Architecture Benchmark Architecture: FLAME is a 60M parameter Transformer optimized specifically for Excel formulas. Key architectural differentiators include an Excel-specific tokenizer and domain-adapted pre-training objectives: masked span prediction and n...	03-13 08:48	Success	-	View
exp_pytrain.20260313084613.001_20260313_084638 Paper: pytrain.20260313084613.001	Type-Safe Virtual Package Builder Benchmark Overview This benchmark demonstrates the ability to construct a Python package entirely in memory, inject it into the runtime environment, and enforce strict type constraints using `typing.Protocol` and Generics. It simulates a build proces...	03-13 08:46	Success	-	View
exp_cr_10.1609_aaai.v38i12.29197_20260313_083849 Paper: cr_10.1609_aaai.v38i12.29197	FLAME Architecture Benchmark Architecture: FLAME is a 60M parameter Transformer optimized specifically for Excel formulas. Key architectural differentiators include an Excel-specific tokenizer and domain-adapted pre-training objectives: masked span prediction and n...	03-13 08:44	Pending	-	View
exp_cr_10.1609_aaai.v38i12.29197_20260313_083809 Paper: cr_10.1609_aaai.v38i12.29197	Backfill Candidate cr_10.1609_aaai.v38i12.29197 Architecture: FLAME is a 60M parameter Transformer optimized specifically for Excel formulas. Key architectural differentiators include an Excel-specific tokenizer and domain-adapted pre-training objectives: masked span prediction and n...	03-13 08:38	Success	-	View
exp_pytrain.20260313083547.003_20260313_083620 Paper: pytrain.20260313083547.003	Robust Typed Plugin Loader with `importlib` This benchmark tests the ability to design a flexible plugin architecture using Python's standard library. The solution must dynamically generate a module in a temporary filesystem context, load it using low-level import utilities, and vali...	03-13 08:36	Success	-	View
exp_oa_W4404574673_20260313_083420 Paper: oa_W4404574673	Backfill Candidate oa_W4404574673 Analysis for ARES 8GB Roadmap * Architecture: The survey reviews standard Transformer-based architectures and pre-training objectives. It identifies multilingual capabilities primarily as a result of data quality, diversity, and ali...	03-13 08:34	Success	-	View
exp_2506.16655v1_20260313_083303 Paper: 2506.16655v1	Arch-Router v1.0 Benchmark Architecture Arch-Router is a compact 1.5B parameter model functioning as a classifier. Instead of generating text, it maps user queries to specific domains (e.g., travel) or action types to select the most appropriate downstream model...	03-13 08:33	Success	-	View
exp_2506.16596v3_20260313_083145 Paper: 2506.16596v3	Cyc-like Knowledge Infrastructure Benchmark This paper outlines a community-driven vision for a modern Cyc-like knowledge infrastructure to address LLM hallucinations and reasoning gaps. * Architecture: Proposes an "open engineering framework" integrating modular Knowledge Repres...	03-13 08:32	Success	-	View
exp_pytrain.20260313082915.002_20260313_082954 Paper: pytrain.20260313082915.002	Generic Event Dispatcher with PEP 695 Syntax Overview This benchmark provides a reference implementation of a thread-safe Generic Event Dispatcher using Python 3.12's PEP 695 Type Parameter Syntax. Hypothesis Utilizing PEP 695 Type Parameter Syntax reduces generic type boilerplate...	03-13 08:29	Success	-	View
exp_2512.14880v1_20260313_082625 Paper: 2512.14880v1	Benchmark: Task Matrices for Efficient Model Specialization Architecture: Introduces "Task Matrices"—linear transformations that map base model embeddings to specific finetuned states. This allows a single base model to simulate the behavior of multiple specialized models by applying distinct li...	03-13 08:27	Success	-	View
exp_hf_2603.09555_20260313_082538 Paper: hf_2603.09555	Backfill Candidate hf_2603.09555 Architecture: Proposes a compiler-first implementation of Mamba-2, leveraging XLA's fusion and tiling passes to handle state space duality (diagonal structures, chunkable recurrence). This eliminates the need for hand-written CUDA or Tr...	03-13 08:25	Success	-	View
exp_2309.10945v1_20260313_082428 Paper: 2309.10945v1	Benchmark: Pirá 2.0 Bilingual Scientific QA Paper: Benchmarks for Pirá 2.0 Type: Dataset Release (No novel model architecture). Summary: This paper establishes baselines for the Pirá 2.0 dataset, a curated bilingual (English/Portuguese) resource for testing expert knowled...	03-13 08:24	Success	-	View
exp_pytrain.20260313082208.001_20260313_082233 Paper: pytrain.20260313082208.001	Strictly-Typed Dependency Resolver Benchmark This benchmark evaluates the ability of an autonomous coding system to implement a robust package dependency resolver using Python's standard library. The solution requires a strict type system (simulating `mypy --strict` compliance), a bac...	03-13 08:22	Success	-	View
exp_hf_2603.06854_20260313_072309 Paper: hf_2603.06854	Benchmark: Audio-Text Text-Dominance Mitigation (Steering Overhead) Architecture Proposes an inference-time activation steering mechanism to mitigate "text dominance" in Large Audio-Language Models (LALMs). It utilizes mechanistic interpretability to identify specific "audio-specialist" attention heads...	03-13 07:23	Pending	-	View
exp_hf_2603.10145_20260313_072159 Paper: hf_2603.10145	Backfill Candidate hf_2603.10145 Architecture: The paper identifies the standard LM Head (projection from hidden dimension $D$ to vocabulary $V$) as a fundamental "gradient bottleneck." Due to the $D \ll V$ mismatch, the rank-$D$ layer acts as a severe compressor durin...	03-13 07:22	Success	-	View
exp_2309.16812v1_20260313_072058 Paper: 2309.16812v1	Benchmark for Semantic Layout-to-Image Diffusion Architecture: Conditional Denoising Diffusion Probabilistic Model (DDPM) utilizing a U-Net backbone enhanced with adaptive normalization (likely SPADE-style) and self-attention mechanisms to integrate semantic layout conditioning. **Mem...	03-13 07:21	Success	-	View
exp_pytrain.20260313071750.090_20260313_071827 Paper: pytrain.20260313071750.090	Dynamic Typed Plugin Loader Objective The objective of this drill is to verify the ability to construct a robust Python plugin architecture that merges strict static typing definitions (using `typing.Protocol`, `TypeVar`, and Generics) with dynamic runtime module gene...	03-13 07:18	Success	-	View
exp_2403.18098v1_20260313_070552 Paper: 2403.18098v1	Legal Entailment Benchmark (COLIEE Task 4) Analysis: GPTs and Language Barrier (COLIEE Task 4) * Architecture: The paper evaluates generic "GPTs" (likely proprietary APIs or large base models) on a legal entailment task. No specific architectural modifications (e.g., pruning...	03-13 07:16	Success	-	View
exp_pytrain.20260313070305.089_20260313_070333 Paper: pytrain.20260313070305.089	Typed Dynamic Plugin Loader This benchmark demonstrates a robust, extensible plugin architecture that leverages Python's `typing.Protocol` for interface safety and `importlib` for dynamic runtime module loading. Objective To validate that dynamically loaded code—often...	03-13 07:03	Success	-	View
exp_cr_10.3390_app14188526_20260313_070104 Paper: cr_10.3390_app14188526	Backfill Candidate cr_10.3390_app14188526 Summary for ARES 8GB Roadmap * Architecture: The paper proposes a hybrid Long Short-Term Memory (LSTM) network integrated with a Self-Attention Mechanism (SA-LSTM). This architecture weights specific time-steps in the input...	03-13 07:01	Success	-	View
exp_2506.16592v1_20260313_070005 Paper: 2506.16592v1	Benchmark for DenseNet121 Attention-Enhanced Hybrid (Candidate 2506.16592v1) Architecture: Utilizes a hybrid design coupling a pre-trained DenseNet121 encoder with a multi-branch attention-enhanced decoder. The bottleneck employs Global Spatial Attention (GSA), Position Encoding, and Scaled Dot-Product Attention...	03-13 07:00	Success	-	View
exp_cr_10.1145_3768167_20260313_065845 Paper: cr_10.1145_3768167	Backfill Candidate cr_10.1145_3768167 Architecture The paper proposes a Graph-Transformer Network (GTN) acting as a surrogate model for circuit topology optimization. It encodes circuit physics specifically—voltage changes in loops and current flows—directly into graph embe...	03-13 06:59	Success	-	View
exp_pytrain.20260313065531.088_20260313_065602 Paper: pytrain.20260313065531.088	Generic Package Metadata Inspector A robust Python coding drill designed to test proficiency with the `importlib.metadata` standard library and modern Generics. Objective Implement a generic class `PackageMetadataInspector[T]` that performs introspection on installed Python...	03-13 06:56	Success	-	View
exp_cr_10.3390_s25185805_20260313_065334 Paper: cr_10.3390_s25185805	Benchmark for BLIP-2 Heterogeneous Input Fusion Architecture: Uses a customized BLIP-2 framework with a Q-Former to fuse heterogeneous inputs (visual frames, kinematic data) into low-dimensional embeddings representing "task demand" and "driving capability" within a shared latent...	03-13 06:53	Success	-	View
exp_2303.16839v3_20260313_065232 Paper: 2303.16839v3	Backfill Candidate 2303.16839v3 Architecture: A decoder-only multimodal model pairing a vision encoder with a unified text decoder. It utilizes a "two-pass" approach: the first pass extracts contrastive embeddings for retrieval, and the second pass performs autoregres...	03-13 06:52	Success	-	View
exp_2303.16576v2_20260313_065106 Paper: 2303.16576v2	Backfill Candidate 2303.16576v2 Architecture: WordStylist utilizes a Latent Diffusion Model (LDM) backbone, comprising a VAE for latent space compression and a U-Net denoiser. It conditions generation on writer style (via class indices) and text content, replacing adv...	03-13 06:51	Success	-	View
exp_pytrain.20260313064740.087_20260313_064818 Paper: pytrain.20260313064740.087	Python Skill Fallback Title: Dynamic Module Loader with Structural Typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-13 06:48	Success	-	View
exp_2303.15132v1_20260313_064556 Paper: 2303.15132v1	Benchmark: Graph-based Label Propagation for ASR Rescoring Architecture Graph-based label propagation model operating on ASR N-best lists. Nodes represent hypotheses, and edges are weighted by cross-utterance acoustic similarity. This allows for collaborative rescoring, utilizing neighboring ut...	03-13 06:46	Success	-	View
exp_cr_10.1609_aaai.v38i17.29885_20260313_064508 Paper: cr_10.1609_aaai.v38i17.29885	Benchmark for Contrastive Confidence Regularizer (CCR) in Dense Retrieval Architecture: Dual-Encoder Dense Retrieval (Contrastive Learning). Retrieval Specifics: * Retrieval Architecture: Standard Dual-Encoder (bi-encoder) with vector similarity search. * Training Strategy: Introduces a "Contrasti...	03-13 06:45	Success	-	View
exp_2507.00033v1_20260313_064344 Paper: 2507.00033v1	Video LLM Context Optimization Benchmark Architecture: Proposes a Retrieval-Augmented Generation (RAG) pipeline where a lightweight text-to-video moment retrieval model acts as a "selector." It retrieves top-$k$ relevant video segments based on the query before passing...	03-13 06:44	Success	-	View
exp_2403.18134v1_20260313_064255 Paper: 2403.18134v1	GTI Block Benchmark Architecture: Proposes a Graph Transformer Integration (GTI) block for Multiple Instance Learning (MIL). It hybridizes a local Graph Convolutional Network (GCN) to model spatial relationships between neighboring tissue patches w...	03-13 06:43	Success	-	View
exp_pytrain.20260313063952.086_20260313_064030 Paper: pytrain.20260313063952.086	Dynamic Backend Resolution with Strict Typing and Metadata Checks This benchmark implements a self-contained "backend dispatcher" mechanism often found in high-performance ML frameworks like vLLM or Diffusers. Overview In production-grade inference engines, the system must dynamically select the most effi...	03-13 06:40	Success	-	View
exp_2409.14557v3_20260313_063753 Paper: 2409.14557v3	Backfill Candidate 2409.14557v3 Architecture: Proposes Exo-MDPs, decomposing state dynamics into independent stochastic (exogenous) and action-dependent deterministic (endogenous) components. Structurally equivalent to Linear Mixture MDPs, enabling linear function app...	03-13 06:37	Success	-	View
exp_cr_10.1609_aaai.v38i21.30443_20260313_063657 Paper: cr_10.1609_aaai.v38i21.30443	Backfill Candidate cr_10.1609_aaai.v38i21.30443 Summary for ARES 8GB Roadmap * Architecture: This research proposes a software-layer methodology rather than a neural architecture. It utilizes existing Transformer-based models, relying on structured prompt engineering (context...	03-13 06:37	Success	-	View
exp_cr_10.51519_journalisi.v7i1.1024_20260313_063618 Paper: cr_10.51519_journalisi.v7i1.1024	Backfill Candidate cr_10.51519_journalisi.v7i1.1024 Subject: IT-Based Knowledge Sharing System with LLM Integration Architecture: Conceptual system architecture proposing the integration of Large Language Models (specifically ChatGPT) into university IT ticketing systems. The design...	03-13 06:36	Success	-	View
exp_2506.16644v1_20260313_063517 Paper: 2506.16644v1	This benchmark simulates the SORE (Sentence-based Omission & Retrieval Engine) architecture. It replaces an autoregr... Architecture SORE replaces autoregressive LLMs with a dual-stage pipeline utilizing multilingual sentence encoders and Approximate Nearest Neighbor (ANN) search. It identifies core content via metadata embeddings and filters extraneous...	03-13 06:35	Success	-	View
exp_pytrain.20260313063309.085_20260313_063336 Paper: pytrain.20260313063309.085	Type-Safe ZipApp Packager Objective Create a Python function `build_distribution` that programmatically generates a `.pyz` (ZipApp) executable from a dictionary of virtual source files. Constraints - Standard Library Only: No external dependencies (e.g., no `myp...	03-13 06:33	Success	-	View
exp_2506.16580v1_20260313_063149 Paper: 2506.16580v1	Backfill Candidate 2506.16580v1 Architecture: Replaces standard encoder blocks with an Emformer (Efficient Memory Transformer) to enable chunk-based attention and streamable processing. The model utilizes a non-autoregressive decoder to parallelize output generati...	03-13 06:31	Success	-	View
exp_oa_W4415031789_20260313_062953 Paper: oa_W4415031789	Benchmark: T2I Architectures (Transformer vs. Mamba/SSM) Architecture: Surveys 141 T2I works (2021–2024), categorizing them into Autoregressive, GAN, and Diffusion foundations. Highlights Mamba and Multimodality as emerging architectures for future performance gains, potentially offering...	03-13 06:30	Success	-	View
exp_hf_2603.09906_20260313_062856 Paper: hf_2603.09906	Benchmark: Reasoning Token Memory & Speed Overhead Architecture: The paper analyzes standard autoregressive LLMs, identifying "reasoning" tokens as a dual-purpose mechanism: a computational buffer for latent processing and a semantic primer (factual priming) that retrieves inaccessible...	03-13 06:29	Success	-	View
exp_pytrain.20260313062640.084_20260313_062705 Paper: pytrain.20260313062640.084	Robust Typed CLI Utility with Protocol Abstraction This benchmark evaluates a Python script's adherence to strict packaging standards and advanced static typing. The candidate script, `benchmark.py`, implements a mock `SystemExporter` utility. It demonstrates robustness by defining a `Stora...	03-13 06:27	Success	-	View
exp_2303.17574v1_20260313_062548 Paper: 2303.17574v1	Benchmark: Expert Weight Removal (EWR) on Flan-T5 Architecture: EWR is a training method for Flan-T5 (Encoder-Decoder) models. It trains a "negative expert" on hallucinated responses and subtracts its weights from the base model, utilizing the Fisher Information Matrix to weigh...	03-13 06:26	Success	-	View
exp_2309.08960v1_20260313_062352 Paper: 2309.08960v1	Benchmark: ODSum Simulation (Retrieve-then-Summarize) Paper: ODSum: New Benchmarks for Open Domain Multi-Document Summarization Architecture: Standard retrieve-then-summarize pipeline. The paper proposes a rule-based method to convert query-based datasets into Open Domain Multi-Doc...	03-13 06:24	Success	-	View
exp_2309.08872v2_20260313_062257 Paper: 2309.08872v2	Benchmark: Structural RAG vs. Naive Chunking (Candidate 2309.08872v2) Architecture: A specialized RAG framework designed to handle document structure, routing queries to retrieve specific layout elements (tables, sections, pages) rather than treating the document as a flat text stream. **Retrieval Strateg...	03-13 06:23	Success	-	View
exp_2403.14258v1_20260313_062142 Paper: 2403.14258v1	Benchmark: Local TRIZ Contradiction Extraction (Llama 3 8B) Architecture: Shifts from fine-tuned BERT-style discriminative classifiers to generative Prompt Engineering using GPT-4 to extract complex TRIZ contradictions. Memory & Speed: The paper relies on API-based GPT-4, bypassing local...	03-13 06:22	Success	-	View
exp_pytrain.20260313061914.083_20260313_061949 Paper: pytrain.20260313061914.083	Dynamic Plugin Loader with Strict Type Validation This benchmark evaluates the implementation of a robust, type-safe plugin architecture using Python's standard library. Problem Statement The objective is to create a system where functionality (plugins) can be discovered and loaded dynamic...	03-13 06:19	Success	-	View
exp_cr_10.1093_llc_fqaf082_20260313_061742 Paper: cr_10.1093_llc_fqaf082	Backfill Candidate cr_10.1093_llc_fqaf082 Architecture: Fine-tuned CLIP (Contrastive Language-Image Pre-Training) model for cross-modal retrieval. Retrieval Strategy: Text-to-Image retrieval using visual feature embeddings (bypassing metadata). Indexing: Vector index of...	03-13 06:17	Success	-	View
exp_2512.14448v1_20260313_061701 Paper: 2512.14448v1	Backfill Candidate 2512.14448v1 This paper investigates Reasoning-Style Poisoning (RSP), targeting ReAct, Reflection, and Tree of Thoughts (ToT) agent architectures. It employs Generative Style Injection (GSI) to rewrite retrieved documents with pa...	03-13 06:17	Success	-	View
exp_cr_10.3390_electronics13183710_20260313_061614 Paper: cr_10.3390_electronics13183710	Backfill Candidate cr_10.3390_electronics13183710 Architecture: Hybrid model utilizing multi-scale frequency decomposition. High-frequency data is processed via a Temporal GNN with an Adaptive Graph Learning module, while low-frequency data uses a Bidirectional Temporal Network, fused...	03-13 06:16	Success	-	View
exp_cr_10.52783_jisem.v10i3.4744_20260313_061522 Paper: cr_10.52783_jisem.v10i3.4744	Backfill Candidate cr_10.52783_jisem.v10i3.4744 Architecture: The paper proposes a hybrid architecture combining an Enhanced Vision Transformer (EViT) with a Bidirectional LSTM (BiLSTM) for glaucoma detection. The EViT extracts global spatial features, while the BiLSTM processes sequ...	03-13 06:15	Success	-	View
exp_pytrain.20260313061210.082_20260313_061311 Paper: pytrain.20260313061210.082	Generic Plugin Loader with PEP 695 Overview This benchmark evaluates a coding agent's ability to utilize modern Python 3.12+ syntax (PEP 695 Type Parameter Syntax) to define generic classes, while simultaneously demonstrating robust packaging practices by dynamically creatin...	03-13 06:13	Success	-	View
exp_2506.16633v2_20260313_055245 Paper: 2506.16633v2	Benchmark for SightSense (GeoGuess) Architecture Paper: GeoGuess (SightSense) Summary for ARES 8GB Roadmap: * Architecture: Proposes SightSense, a multimodal framework processing Street View panoramas. It employs a hierarchical visual encoder to synthesize local de...	03-13 06:10	Success	-	View
exp_hf_2603.10101_20260313_055142 Paper: hf_2603.10101	Benchmark for CLIPO: Zero-Overhead RLVR Integration Architecture: CLIPO modifies the RLVR training pipeline by integrating a contrastive learning objective into policy optimization. Instead of relying solely on sparse, final-answer rewards, it optimizes the model to distinguish between r...	03-13 05:51	Success	-	View
exp_2303.16341v3_20260313_055028 Paper: 2303.16341v3	This benchmark simulates the S-ViLM (Structured Video-Language Modeling) architecture, specifically focusing on the... Paper: S-ViLM (Structured Video-Language Modeling) Architecture: S-ViLM utilizes a dual-stream Transformer (Video + Text). It deviates from global contrastive learning to implement inter-clip spatial grounding (aligning text to...	03-13 05:50	Success	-	View
exp_pytrain.20260313054716.081_20260313_054752 Paper: pytrain.20260313054716.081	Structural Subtyping Plugin Loader This benchmark validates a robust Python plugin architecture based on structural subtyping using `typing.Protocol`. Hypothesis Leveraging `typing.Protocol` combined with `importlib` enables the development of modular, extensible systems whe...	03-13 05:47	Success	-	View
exp_2403.12894v2_20260313_054537 Paper: 2403.12894v2	Backfill Candidate 2403.12894v2 Architecture: Tri-modal binding framework (CXR, ECG, Text) using text as a central anchor. It employs a dual-loss strategy: standard contrastive loss for modality-text pairs and a custom "Edge-Modality Contrastive Loss" to align dispara...	03-13 05:45	Success	-	View
exp_2409.13997v1_20260313_054414 Paper: 2409.13997v1	Backfill Candidate 2409.13997v1 Architecture: DriftNet utilizes a "representational drift" mechanism to navigate local loss landscape minima, dynamically retrieving relevant tasks to prevent catastrophic forgetting. It functions as a lifelong learning layer atop stand...	03-13 05:44	Success	-	View
exp_pytrain.20260313054024.080_20260313_054105 Paper: pytrain.20260313054024.080	Type-Safe Configuration Manager and Mock Plugin Registry This benchmark evaluates a Python developer's ability to construct a robust core system typical of high-performance machine learning frameworks (like PyTorch or Lightning AI). The challenge involves creating a strictly typed configuration s...	03-13 05:41	Success	-	View
exp_2409.14617v1_20260313_053831 Paper: 2409.14617v1	Backfill Candidate 2409.14617v1 Architecture: Protein-Mamba replaces standard attention mechanisms with Mamba State Space Models (SSMs). It employs a two-stage pipeline: self-supervised pre-training on chemical structures followed by supervised fine-tuning. This shift...	03-13 05:38	Success	-	View
exp_2409.14584v1_20260313_053703 Paper: 2409.14584v1	Benchmark for Hybrid Entity Typing System (Candidate 2409.14584v1) Assessment for ARES 8GB Roadmap: * Architecture: Hybrid system combining a fine-tuned Transformer-based text encoder (likely BERT/RoBERTa) with pre-computed network embeddings. Features a classification head over 136 semantic types....	03-13 05:37	Success	-	View
exp_2303.16769v1_20260313_053553 Paper: 2303.16769v1	Backfill Candidate 2303.16769v1 Architecture: Utilizes off-the-shelf Vision-Language Models (VLMs) like CLIP, introducing "Semantic Anchors" to fuse sketch features with textual semantic spaces. Trained via a novel Anchored Contrastive Loss to align sketch embeddings...	03-13 05:35	Success	-	View
exp_pytrain.20260313053245.079_20260313_053336 Paper: pytrain.20260313053245.079	Type-Safe Virtual Package Registry Overview This benchmark is designed to test an autonomous coding system's ability to simulate a complex package distribution and loading mechanism, akin to frameworks like Hugging Face Transformers or vLLM. The Challenge The candidate must...	03-13 05:33	Success	-	View
exp_2309.11206v2_20260313_052042 Paper: 2309.11206v2	Retrieve-Rewrite-Answer RAG Benchmark Architecture: Proposes a modular "Retrieve-Rewrite-Answer" RAG pipeline. Instead of injecting raw Knowledge Graph (KG) triples directly into the prompt, it inserts an intermediate generation step. This "Rewrite" stage converts graph tri...	03-13 05:30	Success	-	View
exp_pytrain.20260313051821.078_20260313_051857 Paper: pytrain.20260313051821.078	Dynamic Type-Safe Plugin Registry This benchmark evaluates a Python script's ability to dynamically construct a modular plugin architecture using `typing.Protocol` for structural subtyping and `importlib` for runtime introspection. Objective The script creates a strict `Dat...	03-13 05:19	Success	-	View
exp_2309.16816v1_20260313_051652 Paper: 2309.16816v1	PROSE: Physics-Informed Multimodal Transformers Architecture: PROSE utilizes a multimodal Transformer architecture with feature fusion to simultaneously map parametric inputs to both numerical solution operators and symbolic mathematical expressions. Memory Footprint: **High Risk...	03-13 05:16	Success	-	View
exp_2409.14607v2_20260313_051552 Paper: 2409.14607v2	Backfill Candidate 2409.14607v2 Architecture Proposes a "Patch Ranking" framework consisting of a lightweight predictor trained to approximate a greedy "Golden Ranking" of local patch tokens. The model prunes lower-ranked tokens and introduces learnable visual prompts...	03-13 05:15	Success	-	View
exp_2409.14572v2_20260313_051435 Paper: 2409.14572v2	Backfill Candidate 2409.14572v2 Summary: Evaluating LLMs in Materials Science This study evaluates standard LLM architectures (not novel ones) for materials science applications (Q&A and property prediction) using prompt engineering strategies like Chain-of-Thought an...	03-13 05:14	Success	-	View
exp_pytrain.20260313051151.077_20260313_051229 Paper: pytrain.20260313051151.077	Strict CLI Subcommand Dispatcher with Protocol-Based Registry Overview This benchmark evaluates the implementation of a lightweight, modular CLI tool using Python's standard library. It focuses on correct usage of `argparse` for subcommands and `typing.Protocol` for structural subtyping to ensure a pl...	03-13 05:12	Success	-	View
exp_cr_10.2196_67967_20260313_051002 Paper: cr_10.2196_67967	Backfill Candidate cr_10.2196_67967 Architecture: The study evaluates a fine-tuned `scispaCy` model against two domain-specific LLMs: NYUTron (110M parameters) and GatorTron (345M parameters). Both are highly optimized "tiny" architectures suitable for clinical NL...	03-13 05:10	Success	-	View
exp_2506.16650v1_20260313_050904 Paper: 2506.16650v1	Backfill Candidate 2506.16650v1 Architecture: Proposes a complex, multi-stage agentic workflow. It moves beyond simple code localization by integrating execution semantics for context retrieval and generalized abstraction for issue understanding. The core uses...	03-13 05:09	Success	-	View
exp_2506.16586v1_20260313_050732 Paper: 2506.16586v1	Benchmark: AI-Agent QA Workflow Simulation (Target: ARES 8GB Roadmap) Assessment: This paper evaluates a workflow rather than a specific model architecture. It focuses on applying generic "state-of-the-art" LLMs to QA tasks. * Architecture: Utilizes AI-agents for automated test case generation, stat...	03-13 05:07	Success	-	View
exp_pytrain.20260313050510.076_20260313_050537 Paper: pytrain.20260313050510.076	Python Skill Fallback Title: Runtime Type-Safe Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-13 05:05	Success	-	View
exp_2512.14896v1_20260313_050256 Paper: 2512.14896v1	DrugRAG Efficiency Benchmark Architecture DrugRAG is a model-agnostic, three-step Retrieval-Augmented Generation (RAG) pipeline. It functions as an external wrapper, retrieving structured drug knowledge to augment prompts without modifying the underlying LLM archit...	03-13 05:03	Success	-	View
exp_hf_2603.10165_20260313_050147 Paper: hf_2603.10165	Benchmark: OpenClaw-RL Policy Deployment Architecture: OpenClaw-RL utilizes an asynchronous pipeline decoupling three components: the live serving policy, a Process Reward Model (PRM) for evaluative signals, and a Hindsight-Guided On-Policy Distillation (OPD) trainer for direc...	03-13 05:02	Success	-	View
exp_hf_2603.08068_20260313_050056 Paper: hf_2603.08068	ICRL: Iterative Curriculum Reinforcement Learning Architecture: ICRL is a training methodology, not a novel inference architecture. It replaces standard SFT+RL pipelines with an RL-only approach, utilizing a "curriculum" where the model learns tool use via in-context examples that are...	03-13 05:01	Success	-	View
exp_pytrain.20260313045743.075_20260313_045827 Paper: pytrain.20260313045743.075	Type-Safe Dynamic Extension Loader Overview This coding drill validates the hypothesis that combining `typing.Protocol` with runtime `importlib` introspection enables the creation of robust, self-verifying plugin architectures. By defining explicit generic interfaces (Protoc...	03-13 04:58	Success	-	View
exp_oa_W4377820925_20260313_045615 Paper: oa_W4377820925	Backfill Candidate oa_W4377820925 Paper Type: General Taxonomy / Survey (Not a specific model architecture). Summary: This text outlines standard NLP workloads rather than a novel architecture. It defines Autoregressive Language Models as the core for text gener...	03-13 04:56	Success	-	View
exp_cr_10.1609_aaai.v37i4.25603_20260313_045523 Paper: cr_10.1609_aaai.v37i4.25603	Backfill Candidate cr_10.1609_aaai.v37i4.25603 Architecture: Dense Retrieval (Contrastive Dual-Encoder). Retrieval Strategy: Unsupervised training via "Approximate Aggregated Positive," aggregating same-case evidence to serve as positive examples for queries. **Indexing/Chunking...	03-13 04:55	Success	-	View
exp_2309.10506v1_20260313_045432 Paper: 2309.10506v1	Table Retrieval Benchmark (Dual-Encoder Structural Aggregation) Architecture: Proposes a dual-encoder dense retrieval framework. It decouples the processing of queries (syntactic representation) and tables (structural representation of headers and values), utilizing a specific "syntactical-to-struct...	03-13 04:54	Success	-	View
exp_cr_10.1609_aaai.v38i8.28779_20260313_045334 Paper: cr_10.1609_aaai.v38i8.28779	Benchmark: TriSampler Enabled Compact Dense Retrieval Classification: Training Optimization (Inference Architecture Agnostic). Architecture & Retrieval: Enhances standard Dense Retrieval (Bi-Encoder) models via a "quasi-triangular" negative sampling principle. It optimizes training...	03-13 04:53	Success	-	View
exp_pytrain.20260313045017.074_20260313_045121 Paper: pytrain.20260313045017.074	Type-Safe Plugin Registry with Semantic Versioning This benchmark tests the implementation of a robust, type-driven plugin architecture using Python's standard library. It simulates a subset of a package manager's core logic, leveraging advanced typing constructs like `Protocols`, `Generics...	03-13 04:51	Success	-	View
exp_cr_10.1142_s0129156425409179_20260313_043305 Paper: cr_10.1142_s0129156425409179	README: Vision Transformer Benchmark (Swin vs ViT) Architecture: Dual-model vision framework utilizing Vision Transformers (ViT) and Swin Transformers for feature extraction, coupled with a spatial indexing strategy for rapid image retrieval. Retrieval Strategy: * **Retrieval Archit...	03-13 04:48	Success	-	View
exp_pytrain.20260313042923.073_20260313_042951 Paper: pytrain.20260313042923.073	README: Typed Plugin Architecture Benchmark This benchmark evaluates a Python system's capability to dynamically construct a strictly typed namespace package at runtime. The test simulates a plugin architecture where a core interface (`Protocol`) is defined in a base module, implemen...	03-13 04:30	Success	-	View
exp_cr_10.1609_aaai.v38i16.29755_20260313_042619 Paper: cr_10.1609_aaai.v38i16.29755	Benchmark: Soft-Prompt Augmented Dense Retrieval Architecture: Standard Dense Retrieval (Bi-Encoder) augmented with learnable soft tokens prepended to inputs. These tokens explicitly decouple domain-specific knowledge and supervision signals, enabling zero-shot adaptation without...	03-13 04:27	Success	-	View
exp_2506.16552v3_20260313_042452 Paper: 2506.16552v3	Backfill Candidate 2506.16552v3 Architecture: Revela employs a standard dense dual-encoder architecture (Bi-Encoder). It integrates retriever optimization into Language Modeling (LM) training by using retriever-computed similarity scores to weight an in-batch cross-do...	03-13 04:24	Success	-	View
exp_pytrain.20260313042055.072_20260313_042139 Paper: pytrain.20260313042055.072	Strict Dataclass Mapper Implementation This benchmark defines a robust, recursive object mapper (`hydrate`) using only the Python standard library. It validates primitive types, handles nested `dataclass` instances, and manages `Optional` fields. Usage The module exposes two pub...	03-13 04:21	Success	-	View
exp_2512.14870v1_20260313_041812 Paper: 2512.14870v1	HERBench Memory & Fusion Benchmark HERBench introduces a high-complexity VideoQA benchmark requiring the aggregation of at least three temporally separated visual cues. It utilizes a Minimum Required Frame-Set (MRFS) metric averaging 5.5 frames, significantly higher than...	03-13 04:18	Success	-	View
exp_hf_2603.08754_20260313_041613 Paper: hf_2603.08754	HCAPO "Hindsight Critique" Performance Benchmark Architecture: HCAPO modifies the Group Relative Policy Optimization (GRPO) framework by repurposing the LLM as a post-hoc critic. It introduces a multi-scale advantage mechanism to refine step-level Q-values and correct misaligned basel...	03-13 04:16	Success	-	View
exp_pytrain.20260313041221.071_20260313_041334 Paper: pytrain.20260313041221.071	Python Skill Fallback Title: Structural Typing for CLI Plugin Architecture - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-13 04:13	Success	-	View
exp_2303.10395v1_20260313_040931 Paper: 2303.10395v1	Graph-Guided Retrieval-Augmented Generation (RAG) Benchmark Architecture: A Graph-Guided Retrieval-Augmented Generation (RAG) framework. It retrieves supporting facts from a textual knowledge base, converts them into a question-specific open Knowledge Graph (KG), and performs sequential reasonin...	03-13 04:09	Success	-	View
exp_2309.12294v1_20260313_040736 Paper: 2309.12294v1	Logical Form (LF) to Text: Dual-Stage Generate-and-Rerank Benchmark Architecture: Proposes a dual-stage Generate-and-Rerank pipeline for Logical Form (LF) to text. A generator LLM creates $N$ diverse candidates, which a task-specific discriminative reranker scores based on semantic alignment and hum...	03-13 04:07	Success	-	View
exp_pytrain.20260313040323.070_20260313_040441 Paper: pytrain.20260313040323.070	Robust Async Plugin Dispatcher Benchmark Overview This benchmark evaluates a Python-based mini-framework designed for dynamically discovering and executing asynchronous tasks. It emphasizes strict type adherence using `typing.Protocol` and explicit namespace management via `__all_...	03-13 04:04	Success	-	View
exp_2403.17359v2_20260313_040042 Paper: 2403.17359v2	Backfill Candidate 2403.17359v2 Architecture & RAG Specifics: Chain-of-Action (CoA) is an agentic RAG framework utilizing a reasoning-retrieval loop. It decomposes queries into "Plug-and-Play" actions to fetch heterogeneous multimodal data. * Retrieval: Iterat...	03-13 04:00	Success	-	View
exp_2512.13164v2_20260313_035850 Paper: 2512.13164v2	Backfill Candidate 2512.13164v2 Architecture: CRAFTS is a Latent Diffusion Model (LDM) utilizing a dual-stage "Correlation-Regulated Alignment Framework" to minimize semantic drift. It integrates ControlNet for spatial conditioning via segmentation masks. **Memory Foo...	03-13 03:58	Success	-	View
exp_2403.18058v2_20260313_035737 Paper: 2403.18058v2	Backfill Candidate 2403.18058v2 Architecture: N/A (Data-centric). This paper introduces a high-quality Chinese instruction tuning dataset (COIG-CQIA) derived from real-world sources. It is designed to fine-tune existing open-source architectures (e.g., LLaMA, Baichuan...	03-13 03:57	Success	-	View
exp_pytrain.20260313035409.069_20260313_035448 Paper: pytrain.20260313035409.069	Dynamic Type-Safe Plugin Registry Benchmark This benchmark tests the ability to construct a robust, type-safe plugin architecture using Python's standard library. It evaluates the implementation of dynamic module loading, runtime type checking using `typing.Protocol`, and filesystem...	03-13 03:54	Success	-	View
exp_cr_10.3390_math12182941_20260313_035154 Paper: cr_10.3390_math12182941	Backfill Candidate cr_10.3390_math12182941 Architecture: Proposes a weighted-average ensemble of five heterogeneous Arabic Transformers (AraBERT, MARBERT, AraELECTRA, AraGPT2, ARBERT). Memory Footprint: Critical Bottleneck. Concurrently loading five distinct encoder/deco...	03-13 03:51	Success	-	View
exp_2506.16623v1_20260313_035042 Paper: 2506.16623v1	Backfill Candidate 2506.16623v1 Architecture The framework utilizes a frontier-based exploration strategy guided by a Vision-Language Model (VLM). Instead of simple embedding similarity, it employs dynamic history-augmented prompting. The system injects a text...	03-13 03:50	Success	-	View
exp_pytrain.20260313034627.068_20260313_034739 Paper: pytrain.20260313034627.068	Robust Dynamic Plugin Loader Benchmark This benchmark evaluates a Python implementation of a robust plugin architecture using `importlib` for dynamic discovery and `typing.Protocol` for structural subtyping. Objective Create a system that: 1. Dynamically generates a temporary en...	03-13 03:47	Success	-	View
exp_oa_W4415248384_20260313_034239 Paper: oa_W4415248384	Benchmark: Transformer vs. Mamba (SSM) Efficiency on 8GB Constraints Subject: Analysis of A Comprehensive Survey of Large AI Models for Future Communications This survey evaluates Large AI Models (LAMs) for 6G, reviewing Transformers, Diffusion, and Mamba architectures. Key takeaways for the ARES 8...	03-13 03:42	Success	-	View
exp_hf_2603.09877_20260313_034113 Paper: hf_2603.09877	Benchmark for InternVL-U Architecture Simulation Architecture: InternVL-U utilizes a hybrid "decoupled" architecture, merging a Multimodal Large Language Model (MLLM) for understanding/reasoning with a specialized Multimodal Diffusion Transformer (MMDiT) head for visual generation and...	03-13 03:41	Success	-	View
exp_2309.14735v2_20260313_034009 Paper: 2309.14735v2	Backfill Candidate 2309.14735v2 Paper Classification: Comparative Survey / Evaluation (Not a new architecture proposal). * Architecture: Benchmarks existing "AILQA paradigms" against OpenAI GPT (API-based baseline). No specific local model architecture (e.g., Enco...	03-13 03:40	Success	-	View
exp_pytrain.20260313033558.067_20260313_033723 Paper: pytrain.20260313033558.067	Generic Data Pipeline with Protocol Registration This benchmark evaluates an autonomous coding system's ability to architect a modular, type-safe data processing pipeline using Python's advanced `typing` features (`Protocol`, `Generic`, `TypeVar`) and packaging standards (`__all__`). Obje...	03-13 03:37	Success	-	View
exp_2309.09070v1_20260313_033309 Paper: 2309.09070v1	Legal QA Hybrid Retrieval Benchmark (L2R + PLM) Architecture: Hybrid system combining classical statistical models and Pre-trained Language Models (PLMs) for legal domain QA. Retrieval Architecture: Employs a Learning-to-Rank (L2R) approach to consolidate features from variou...	03-13 03:33	Success	-	View
exp_2309.08187v1_20260313_033118 Paper: 2309.08187v1	Benchmark: Hybrid Retrieval with Encoded Summarization (2309.08187v1) Architecture: Hybrid retrieval system combining lexical (sparse) and latent (dense) features via a deep neural phrase-scoring framework. Retrieval Strategy: Encoded Summarization. The method compresses full legal documents into...	03-13 03:31	Success	-	View
exp_pytrain.20260313032651.066_20260313_032756 Paper: pytrain.20260313032651.066	Strictly Typed Plugin System with Semantic Versioning Overview This benchmark validates the hypothesis that enforcing structural sub-typing using `typing.Protocol` and runtime `inspect` validation creates a more robust plugin architecture than implicit duck-typing. The `ComponentRegistry` dyna...	03-13 03:27	Success	-	View
exp_2403.16702v1_20260313_031453 Paper: 2403.16702v1	Bi-Encoder Code Search Benchmark (Dual-Encoder) Architecture & Feasibility: The paper proposes a Dual-Encoder (Bi-Encoder) architecture using modality-agnostic contrastive pre-training to align natural language queries with code representations. This is highly feasible for 8GB VR...	03-13 03:25	Success	-	View
exp_pytrain.20260313031141.065_20260313_031222 Paper: pytrain.20260313031141.065	Dynamic Type-Safe Plugin Loader Benchmark This coding drill evaluates the ability to implement a robust, type-safe plugin system using only the Python standard library. The focus is on dynamic module generation, structural subtyping (Protocols), and generic type safety. Features -...	03-13 03:12	Success	-	View
exp_2409.09010v1_20260313_030945 Paper: 2409.09010v1	Backfill Candidate 2409.09010v1 Architecture: Hybrid Graph-Text RAG pipeline (Retrieve-then-Read). Retrieval Architecture: Dual-source extraction combining structured Knowledge Graphs (DBLP, SemOpenAlex) and unstructured text (Wikipedia). Indexing/Chunking: Ab...	03-13 03:09	Success	-	View
exp_2512.13511v1_20260313_030733 Paper: 2512.13511v1	TARA: Dual-Encoder Video-Text Retrieval Benchmark Architecture: TARA adapts frozen MLLMs (e.g., LLaVA) into video-text embedding models by adding a trainable projection layer. It is trained exclusively on synthetic caption data, eliminating the need for real video datasets. **Retrieval...	03-13 03:07	Success	-	View
exp_pytrain.20260313030319.064_20260313_030441 Paper: pytrain.20260313030319.064	Strictly-Typed Data Pipeline CLI Benchmark Overview This benchmark defines a coding drill focused on Strict Typing and Interface Segregation using Python's `typing.Protocol` and `argparse`. The goal is to implement a text processing pipeline where components adhere to a stri...	03-13 03:04	Success	-	View
exp_2512.13001v1_20260313_030054 Paper: 2512.13001v1	Backfill Candidate 2512.13001v1 This paper validates the superiority of Text Embedding Models (TEMs) over Large Language Models (LLMs) for training-free cold-start recommendation (TFCSR). * Architecture: Benchmarks a TEM-based retrieval approach (bi-encoder ve...	03-13 03:00	Success	-	View
exp_pytrain.20260313025506.063_20260313_025618 Paper: pytrain.20260313025506.063	Structural Subtyping Dispatcher Benchmark Objective This benchmark evaluates the implementation of a robust CLI dispatcher using Python's `typing.Protocol` for structural subtyping. The architecture ensures that the core dispatcher remains agnostic to concrete command implementatio...	03-13 02:56	Success	-	View
exp_2512.14856v2_20260313_025206 Paper: 2512.14856v2	Backfill Candidate 2512.14856v2 Architecture: T5Gemma 2 repurposes the decoder-only Gemma 3 into an encoder-decoder architecture via UL2 adaptation, specifically optimized for multimodal and long-context tasks. Memory Footprint: The model prioritizes VRAM effi...	03-13 02:52	Success	-	View
exp_cr_10.24252_literatify.v5i1.44458_20260313_025015 Paper: cr_10.24252_literatify.v5i1.44458	Vector Space Model (VSM) Benchmark Report: Literature Review on Vector Space Models (VSM) Type: Literature Review (Traditional Information Retrieval) Relevance: Low (Non-Neural), but applicable to RAG preprocessing. * Architecture: Analyzes the classic **Vect...	03-13 02:50	Success	-	View
exp_pytrain.20260313024535.062_20260313_024713 Paper: pytrain.20260313024535.062	Modern Generic Cache with PEP 695 and Module Hygiene Objective This coding drill validates the implementation of a modern, thread-safe Least Recently Used (LRU) Cache utilizing PEP 695 Type Parameter Syntax (Python 3.12+) and strict module packaging standards. Key Concepts * **PEP 695 (Ty...	03-13 02:47	Success	-	View
exp_2403.18093v1_20260313_024223 Paper: 2403.18093v1	Benchmark: 3-Stage Retrieval-Augmented Generation (RAG) Pipeline Architecture: A sequential 3-stage pipeline: Sparse Retrieval (BM25) $\rightarrow$ Neural Re-ranking (BERT) $\rightarrow$ Generative Retrieval (LLM Prompting). Memory Footprint: Mixed. The BM25 and BERT stages are low-VRAM and feasi...	03-13 02:44	Success	-	View
exp_pytrain.20260313023843.061_20260313_023939 Paper: pytrain.20260313023843.061	Dynamic Plugin Loader with Protocol Validation Overview This coding drill demonstrates the use of Python's `importlib` and `typing.Protocol` to build a robust, dynamic plugin system. Objective Construct a command-line script that acts as a plugin loader: 1. Define Protocol: Use `typ...	03-13 02:39	Success	-	View
exp_hf_2603.08561_20260313_022704 Paper: hf_2603.08561	RetroAgent Context-Memory Benchmark Architecture: RetroAgent introduces an online RL framework utilizing "hindsight self-reflection" to generate dual intrinsic feedback: numerical rewards for tracking exploration and linguistic lessons stored in an explicit memory buffer....	03-13 02:37	Success	-	View
exp_2403.16218v4_20260313_022530 Paper: 2403.16218v4	This benchmark evaluates the efficacy of the "Coverage-Guided Iterative Generation" architecture described in the subjec... Architecture: Iterative "Test-Analyze-Refine" loop. Uses a standard LLM coupled with a Python interpreter and coverage analyzer (e.g., `coverage.py`). It generates tests, executes them to identify uncovered lines/branches, and feeds the...	03-13 02:25	Success	-	View
exp_2403.13468v1_20260313_022442 Paper: 2403.13468v1	Backfill Candidate 2403.13468v1 Architecture: Uses a Mixture-of-Experts (MoE) framework comprising a neural gating network (trained on Wikipedia) and multiple specialized domain experts. Retrieval Architecture: Dense Bi-Encoder retrieval. The gating mechanism clas...	03-13 02:24	Success	-	View
exp_pytrain.20260313022129.060_20260313_022242 Paper: pytrain.20260313022129.060	Runtime Type-Safe Plugin Loader Benchmark This benchmark tests the ability to construct a robust, type-safe plugin system using Python's standard library, mirroring the module discovery and registration patterns found in large-scale frameworks like PyTorch or LitGPT. Objective Crea...	03-13 02:22	Success	-	View
exp_2409.09717v1_20260313_020953 Paper: 2409.09717v1	This benchmark focuses on the core bottleneck identified in the abstract: the multi-turn latency introduced by the "Expe... Architecture: Embodied agent framework utilizing function-calling to interface with ATC simulators, augmented by a retrieval mechanism. Retrieval Architecture: "Experience Library" (Vector DB). Strategy: Stores synthesized knowl...	03-13 02:19	Success	-	View
exp_2403.18105v2_20260313_020848 Paper: 2403.18105v2	README: Educational LLM Tutoring Benchmark Assessment: Low Technical Relevance for ARES 8GB Roadmap * Architecture: N/A. This is a survey paper reviewing existing educational applications (tutoring, adaptive learning) and datasets. It does not propose a new model architectur...	03-13 02:09	Success	-	View
exp_2403.18063v2_20260313_020737 Paper: 2403.18063v2	Heracles: High-Resolution Vision Model Benchmark Architecture Heracles is a hybrid model combining a local SSM (using localized convolutions), a global SSM (leveraging a Hartley kernel), and an attention-based token interaction module. This design mitigates the instability of pure SSM...	03-13 02:07	Success	-	View
exp_pytrain.20260313020419.059_20260313_020457 Paper: pytrain.20260313020419.059	Typed Plugin Registry with Protocol Enforcement This coding drill benchmarks a robust, dependency-injection style registry system built entirely with Python's standard library. It leverages structural sub-typing via `typing.Protocol` and Generics (`typing.TypeVar`) to ensure type safety...	03-13 02:05	Success	-	View
exp_2303.16780v1_20260313_020242 Paper: 2303.16780v1	Thistle VDB Benchmark Architecture & Retrieval Strategy: Thistle is a Rust-based vector database designed for high-performance, local semantic search. It functions as the retrieval backbone for RAG systems, utilizing standard Approximate Nearest Neighbor...	03-13 02:02	Success	-	View
exp_2303.16780v1_20260313_020126 Paper: 2303.16780v1	Benchmark: Thistle Rust-Based VDB Integration Architecture & Retrieval Strategy: Thistle is a Rust-based vector database designed for high-performance, local semantic search. It functions as the retrieval backbone for RAG systems, utilizing standard Approximate Nearest Neighbor...	03-13 02:01	Success	-	View
exp_2309.12158v1_20260313_020019 Paper: 2309.12158v1	Benchmark: Cross-Modal Audio-Sheet Music Retrieval (SSM Dual-Encoder) Paper Type: Survey/Review on Cross-Modal Retrieval. Architecture: The paper evaluates Cross-Modal Deep Learning architectures, specifically Dual-Encoders (Siamese networks) that learn a Joint Embedding Space to link audi...	03-13 02:00	Success	-	View
exp_pytrain.20260313015742.058_20260313_015822 Paper: pytrain.20260313015742.058	Type-Safe Plugin Architecture Benchmark This project implements a robust, type-safe plugin architecture using Python's `typing.Protocol` and Generics. It demonstrates structural subtyping (duck typing with static type hints) to enforce interface contracts without explicit inherit...	03-13 01:58	Success	-	View
exp_2309.11087v6_20260313_015600 Paper: 2309.11087v6	Backfill Candidate 2309.11087v6 Architecture: Reference-Free DNA Transformer encoder utilizing contrastive loss to project reads and reference fragments into a shared vector space. Retrieval Strategy (RAG-oriented): * Architecture: Approximate Nearest Neighbor...	03-13 01:56	Success	-	View
exp_2403.12393v1_20260313_015437 Paper: 2403.12393v1	Backfill Candidate 2403.12393v1 Architecture: Dr3 is an inference wrapper, not a standalone model. It adds a Discriminator module to detect off-topic answers and a Corrector loop that refines outputs backward (Re-Compose $\rightarrow$ Re-Solve $\rightarrow$ Re...	03-13 01:54	Success	-	View
exp_2409.12959v2_20260313_015323 Paper: 2409.12959v2	Benchmark: MMSearch-Engine Pipeline (Candidate 2409.12959v2) Assessment: The paper introduces `MMSearch-Engine`, a retrieval-augmented generation (RAG) pipeline designed to empower Large Multimodal Models (LMMs) with search capabilities, plus the `MMSearch` benchmark. * **Architecture & RAG Strat...	03-13 01:53	Success	-	View
exp_2409.08788v1_20260313_015243 Paper: 2409.08788v1	Backfill Candidate 2409.08788v1 Architecture: A dual-stage pipeline consisting of a self-supervised ECG encoder (generating fixed-dimensional embeddings from raw time-series data) coupled with an off-the-shelf LLM for report synthesis and QA. RAG Strategy: * **Ret...	03-13 01:52	Success	-	View
exp_pytrain.20260313014950.057_20260313_015042 Paper: pytrain.20260313014950.057	Dynamic Package Construction and Type Verification Overview This benchmark evaluates an agent's ability to programmatically generate a valid Python package structure, write strictly typed Python code into it, and subsequently verify the structure and type correctness using reflection and dy...	03-13 01:50	Success	-	View
exp_2403.18128v1_20260313_014814 Paper: 2403.18128v1	Backfill Candidate 2403.18128v1 Architecture: HealthGAT utilizes a hierarchical Graph Attention Network (GAT) architecture. It transforms raw Electronic Health Records (EHR) into a graph structure, employing iterative refinement layers to update medical code embedding...	03-13 01:48	Success	-	View
exp_2409.14556v2_20260313_014724 Paper: 2409.14556v2	Backfill Candidate 2409.14556v2 Architecture: RACOON utilizes a Retrieval-Augmented Generation (RAG) pipeline, substituting standard vector retrieval with Knowledge Graph (KG) querying. It dynamically retrieves semantic context and constraints from the KG to augment t...	03-13 01:47	Success	-	View
exp_hf_2603.04597_20260313_014616 Paper: hf_2603.04597	Benchmark: GOLF (Group-level Natural Language Feedback) Paper Analysis: GOLF (Group-level Natural Language Feedback) Architecture: GOLF introduces a unified RL framework that moves beyond scalar rewards by leveraging group-level natural language feedback. It aggregates two distinct sourc...	03-13 01:46	Success	-	View
exp_2409.13920v1_20260313_014525 Paper: 2409.13920v1	Backfill Candidate 2409.13920v1 Architecture: ByT5 (Byte-level Text-to-Text Transfer Transformer). An encoder-decoder model fine-tuned for Sanskrit morphology (segmentation, lemmatization, POS tagging). It processes raw bytes, eliminating the need for tokenizers and h...	03-13 01:45	Success	-	View
exp_pytrain.20260313014313.056_20260313_014339 Paper: pytrain.20260313014313.056	Dynamic Plugin Loader with Strict Protocol Validation Overview This benchmark evaluates a system's capability to dynamically construct a Python package ecosystem at runtime, load modules via `importlib`, and enforce strict structural typing using `typing.Protocol`. Objective The `PluginManager...	03-13 01:43	Success	-	View
exp_2506.15594v1_20260313_014100 Paper: 2506.15594v1	Backfill Candidate 2506.15594v1 WikiMixQA is a benchmark evaluating Visual RAG capabilities, comprising 1,000 multimodal questions over tables and charts from 4,000 long Wikipedia pages. * Retrieval Architecture: The benchmark evaluates models in a "Retrie...	03-13 01:41	Success	-	View
exp_2303.12998v1_20260313_013906 Paper: 2303.12998v1	This benchmark evaluates the local feasibility of the candidate "Universal NFT Vector Database" (2303.12998v1). The orig... Architecture: Modular, cloud-centered framework utilizing vector embeddings to represent NFTs (ERC-721) for similarity matching and duplicate detection. Retrieval Specifics: * Architecture: Universal NFT Vector Database. * **Ind...	03-13 01:39	Success	-	View
exp_pytrain.20260313013540.055_20260313_013627 Paper: pytrain.20260313013540.055	Generic Type-Safe Configuration Store This benchmark evaluates the implementation of a generic, type-safe configuration store using modern Python 3.12+ features. Features * PEP 695 Support: Uses the new type parameter syntax `class ConfigStore[T]:` for cleaner, more maintai...	03-13 01:36	Success	-	View
exp_cr_10.1609_aaai.v38i20.30232_20260313_013156 Paper: cr_10.1609_aaai.v38i20.30232	RAG Legal QA Benchmark (8GB VRAM Constraint) Architecture: An end-to-end RAG ("retrieve-then-read") pipeline designed for long-form French legal QA, utilizing the LLeQA dataset. Retrieval Strategy: The system retrieves "pertinent legal provisions" (statutory text) to groun...	03-13 01:33	Success	-	View
exp_cr_10.3390_app14062613_20260313_013050 Paper: cr_10.3390_app14062613	Sparse RAG Pipeline: CPU-Bound Lucene Simulation Architecture: Sparse RAG pipeline utilizing Apache Lucene for indexing 26.5M PubMed articles. Retrieval & Chunking: Employs Query Likelihood with Dirichlet Smoothing (outperforming BM25) on full-text documents. **Reranking & Citatio...	03-13 01:30	Success	-	View
exp_pytrain.20260313012805.054_20260313_012837 Paper: pytrain.20260313012805.054	Strictly Typed PyProject Metadata Builder This benchmark evaluates a Python engineer's ability to utilize advanced static typing constructs to define robust data structures for packaging configurations. Overview Python's dynamic nature allows for flexibility, but in complex systems...	03-13 01:28	Success	-	View
exp_cr_10.1167_tvst.14.9.18_20260313_012602 Paper: cr_10.1167_tvst.14.9.18	Ophthalmology RAG Benchmark Paper Summary: Advancing Question-Answering in Ophthalmology This study benchmarks open-source LLMs (Llama 2, Mistral) against proprietary models (GPT-3.5/4) within a Retrieval-Augmented Generation (RAG) framework for ophthalmology. * *...	03-13 01:26	Success	-	View
exp_2506.12733v1_20260313_012447 Paper: 2506.12733v1	Learning to Fuse: Modality-Aware Adaptive Scheduling (MA-AFS) Architecture: MA-AFS introduces a lightweight neural scheduler that dynamically modulates fusion weights for multimodal encoders (e.g., CLIP, BLIP). It predicts instance-specific weights based on visual/textual entropy and cross-modal a...	03-13 01:24	Success	-	View
exp_cr_10.1128_jcm.01624-24_20260313_012354 Paper: cr_10.1128_jcm.01624-24	Retrieval-augmented generation salvages poor performance from large language models in answering microbiology-specific m... Assessment: This paper validates the core 8GB VRAM hypothesis: Domain-specific RAG enables a 7B model (Llama-2) to significantly outperform GPT-4. It demonstrates that retrieval quality is more critical than parameter count for specia...	03-13 01:23	Success	-	View
exp_pytrain.20260313012118.053_20260313_012158 Paper: pytrain.20260313012118.053	Dynamic Type-Validated Plugin Registry Overview This benchmark tests the ability to design a robust, type-safe plugin architecture using Python's standard library. The objective is to simulate an environment where "plugins" are dynamically created as isolated modules, discovered...	03-13 01:22	Success	-	View
exp_2409.13483v1_20260313_011922 Paper: 2409.13483v1	Speech-Based Open-Domain QA Benchmark This paper proposes an ASR-free Multimodal Dense Retriever for spoken open-domain QA, bypassing the error-prone ASR transcription step. Architecture: Utilizes a Dual-Encoder setup: a frozen speech encoder (e.g., wav2vec 2.0) and...	03-13 01:19	Success	-	View
exp_2403.11335v1_20260313_011742 Paper: 2403.11335v1	ConvSDG: Session Data Generation for Conversational Search ConvSDG is a data-centric training framework utilizing offline LLMs to generate synthetic multi-turn sessions, thereby improving Conversational Dense Retrievers (Bi-encoders). * Retrieval Architecture: Dense Bi-encoder (Query-Do...	03-13 01:17	Success	-	View
exp_2403.11671v1_20260313_011644 Paper: 2403.11671v1	HDLdebugger: Streamlining HDL debugging with Large Language Models Architecture: HDLdebugger is a retrieval-augmented framework designed for Hardware Description Language (HDL) debugging. It integrates a reverse-engineering data generator, a search engine for context retrieval, and a fine-tuned Large L...	03-13 01:16	Success	-	View
exp_pytrain.20260313011353.052_20260313_011413 Paper: pytrain.20260313011353.052	Type-Safe Plugin Loader for Inference Models Overview This coding drill challenges you to construct a robust, framework-agnostic model loading system in Python. The goal is to implement a `ModelRegistry` that enforces strict contracts on "inference plugins" without requiring them to i...	03-13 01:14	Success	-	View
exp_2403.17611v1_20260313_011211 Paper: 2403.17611v1	DoTTeR Benchmark: Table-Text Retrieval Evaluation Architecture: DoTTeR utilizes a dense retrieval framework augmented with a specialized Rank-Aware Column Encoder. It employs a false-positive detection model (during training) to denoise data and integrates table-level ranking i...	03-13 01:12	Success	-	View
exp_2309.08469v2_20260313_011116 Paper: 2309.08469v2	Silver Retriever Benchmark Architecture: Silver Retriever utilizes a Dense Bi-Encoder architecture (query and passage encoded independently) based on a Polish BERT variant (likely HerBERT or similar), optimized for semantic vector matching. **Memory & Inferen...	03-13 01:11	Success	-	View
exp_2309.08788v2_20260313_011037 Paper: 2309.08788v2	BioinspiredLLM Benchmarking Suite Architecture & Feasibility: BioinspiredLLM is an open-source autoregressive transformer fine-tuned on a corpus of ~1,000 peer-reviewed articles. Critical Gap: The abstract does not specify the base model parameter count (e.g., 7B vs...	03-13 01:10	Success	-	View
exp_pytrain.20260313010650.051_20260313_010722 Paper: pytrain.20260313010650.051	Stdlib ZipApp Builder with Protocol Enforcement Overview This benchmark tests the ability to programmatically construct a Python application using only the standard library. The task involves generating a virtual filesystem, enforcing a `typing.Protocol` interface for a data processing a...	03-13 01:07	Success	-	View
exp_2512.14944v1_20260313_010442 Paper: 2512.14944v1	Puzzle Curriculum GRPO (PC-GRPO) Benchmark Architecture & Methodology PC-GRPO is a post-training reinforcement learning algorithm for VLMs (tested on Qwen-3B/7B). It eliminates external verifiers by using self-supervised "puzzle" environments (PatchFit, Rotation, Jigsaw) to gene...	03-13 01:04	Success	-	View
exp_2512.11490v1_20260313_010337 Paper: 2512.11490v1	VLM2GeoVec: Toward Universal Multimodal Embeddings for Remote Sensing Architecture: Single-encoder Vision-Language Model (VLM) trained contrastively to embed interleaved inputs (images, text, bounding boxes, coordinates) into a unified vector space. Retrieval Architecture: **Single-encoder contrastive...	03-13 01:03	Success	-	View
exp_2512.12818v1_20260313_010251 Paper: 2512.12818v1	Hindsight: Agent Memory Benchmark Architecture: Hindsight replaces standard vector retrieval with a structured "first-class" substrate comprising four logical networks (world facts, agent experiences, entity summaries, beliefs) and a recursive "reflection" layer that up...	03-13 01:03	Success	-	View
exp_pytrain.20260313005911.050_20260313_010003 Paper: pytrain.20260313005911.050	Python Skill Fallback Title: Typed Asynchronous Data Ingestion Framework - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-13 01:00	Success	-	View
exp_2506.14429v3_20260313_005718 Paper: 2506.14429v3	LongLLaDA Benchmark Paper: LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs Architecture: Utilizes Diffusion LLMs (LLaDA) enhanced with NTK-aware interpolation (RoPE scaling) for context extrapolation. Memory Footprint: **High Poten...	03-13 00:57	Success	-	View
exp_2506.15925v1_20260313_005605 Paper: 2506.15925v1	This benchmark evaluates the "Reranking-based Generation" concept. It compares a standard Zero-Shot generation baseline... Architecture: This paper proposes a Reranking-based Generation pipeline. It diverges from single-pass inference by first generating multiple summary candidates (e.g., via zero-shot sampling) and then employing a separate **LLM-based...	03-13 00:56	Success	-	View
exp_cr_10.69978_rebicte.v11i.210_20260313_005456 Paper: cr_10.69978_rebicte.v11i.210	Benchmark: Neural Network Indexing vs Classical B-Tree Architecture/Retrieval: Proposes a Learned Index Model, replacing traditional structures (B-Trees, Hash) with a Neural Network that acts as a mapping function. The NN approximates the Cumulative Distribution Function (CDF) of data t...	03-13 00:55	Success	-	View
exp_pytrain.20260313005152.049_20260313_005231 Paper: pytrain.20260313005152.049	Dynamic Type-Checked Plugin Loader Overview This benchmark tests the ability to design a robust plugin architecture using Python's `importlib` for dynamic module loading and `typing.Protocol` for structural sub-typing (duck typing with static-like hints). The Challenge Imple...	03-13 00:52	Success	-	View
exp_2409.11901v1_20260313_004956 Paper: 2409.11901v1	LLMs + Persona-Plug = Personalized LLMs Architecture: Proposes Persona-Plug, consisting of a frozen base LLM augmented by a lightweight, trainable User Embedder. This module aggregates all historical user contexts to generate a single, dense user-specific embedding ve...	03-13 00:50	Success	-	View
exp_cr_10.3390_app14062506_20260313_004850 Paper: cr_10.3390_app14062506	Sensor Data Retrieval Benchmark Architecture: A dual-stage pipeline comprising: (1) an LLM-based ETL component that normalizes unstructured sensor data into FAIR-compliant formats (offline), and (2) a retrieval component that creates semantic embeddings of entire tabu...	03-13 00:49	Success	-	View
exp_2403.17007v1_20260313_004753 Paper: 2403.17007v1	DreamLIP Benchmark Simulation Architecture: Standard dual-encoder (Vision Transformer + Text Transformer) utilizing a contrastive learning framework. It introduces a "grouping loss" and dynamic sub-caption sampling during training to align specific text chunks with...	03-13 00:48	Success	-	View
exp_2403.17998v1_20260313_004708 Paper: 2403.17998v1	T-MASS: Text Is MASS Benchmark Architecture: T-MASS replaces static text embeddings with stochastic distributions ("text masses") within a joint text-video embedding space. It employs a similarity-aware radius module to dynamically scale the semantic range of the...	03-13 00:47	Success	-	View
exp_pytrain.20260313004435.048_20260313_004509 Paper: pytrain.20260313004435.048	Type-Safe Plugin Loader for Namespace Packages This benchmark tests the ability to construct a robust, type-safe plugin architecture using Python's standard library. The focus is on leveraging `typing.Protocol` for interface definition, `typing.Generic` for container safety, and `import...	03-13 00:45	Success	-	View
exp_2309.07610v1_20260313_004240 Paper: 2309.07610v1	Feature Engineering in Learning-to-Rank for Community Question Answering Task Architecture: A hybrid Learning-to-Rank (LTR) framework that fuses sparse lexical features (BM25, TF-IDF) with dense semantic features derived from a BERT encoder. It explicitly utilizes features extracted from both questions and answer...	03-13 00:42	Success	-	View
exp_2309.10954v2_20260313_004141 Paper: 2309.10954v2	In-Context Learning for Text Classification with Many Labels Architecture: A retrieval-augmented ICL pipeline combining a pre-trained dense retrieval model with frozen LLMs (OPT, LLaMA). RAG Specifics: * Retrieval Architecture: Dense retrieval (bi-encoder). * Strategy: **Label Spa...	03-13 00:41	Success	-	View
exp_2309.12669v1_20260313_004038 Paper: 2309.12669v1	HRoT Benchmark Architecture & Retrieval Strategy: HRoT is a prompt-engineering framework combining a Retriever-Reader pipeline. It employs a Retrieval of Thought (RoT) mechanism, effectively treating reasoning retrieval as a task to fetch spec...	03-13 00:41	Success	-	View
exp_2309.14323v1_20260313_003944 Paper: 2309.14323v1	Cluster Language Model Benchmark Architecture: Proposes replacing global bi-encoders with Cluster Language Models (CLMs). Retrieval Strategy: * Indexing/Chunking: Uses K-Means to cluster queries based on semantic similarity. * Method: Fine-tunes a d...	03-13 00:40	Success	-	View
exp_pytrain.20260313003711.047_20260313_003745 Paper: pytrain.20260313003711.047	Strict Generic Registry & Packaging Benchmark This benchmark tests the ability to implement a robust, type-safe plugin registry using Python's advanced typing features (`Protocol`, `Generic`, `TypeVar`, `runtime_checkable`) within a simulated package structure (`__all__`). Drill Instru...	03-13 00:37	Success	-	View
exp_2303.13009v1_20260313_003530 Paper: 2303.13009v1	MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models Architecture: MELTR is a training-phase plug-in module utilizing a Transformer network and bi-level optimization (Approximate Implicit Differentiation) to dynamically combine multiple loss functions for fine-tuning video foundation...	03-13 00:35	Success	-	View
exp_2303.14617v1_20260313_003433 Paper: 2303.14617v1	Neural Graph Reasoning (NGDB) Benchmark This paper proposes Neural Graph Databases (NGDB) for Complex Logical Query Answering (CLQA), shifting retrieval from structural indices to latent reasoning. * Architecture: NGDB separates into a Neural Graph Storage (Graph/Feature/...	03-13 00:34	Success	-	View
exp_hf_2603.07392_20260313_003336 Paper: hf_2603.07392	Can Large Language Models Keep Up? Benchmarking Online Adaptation to Continual Knowledge Streams Assessment: OAKS Benchmark on Continual Knowledge Streams * Architecture: The paper introduces the OAKS benchmark to stress-test LLMs on evolving facts within streaming contexts. It evaluates 14 models, including base LLMs and *...	03-13 00:33	Success	-	View
exp_2512.14865v1_20260313_003227 Paper: 2512.14865v1	Audio MultiChallenge Benchmark Paper: Audio MultiChallenge (Benchmark) Architecture & Scope: This paper introduces Audio MultiChallenge, a benchmark for End-to-End (E2E) Spoken Dialogue Systems (SDS) that process raw audio without intermediate transcription....	03-13 00:32	Success	-	View
exp_pytrain.20260313002953.046_20260313_003035 Paper: pytrain.20260313002953.046	Strict Module Interface Validator Overview This benchmark simulates the initialization routine of a high-performance library (like vLLM or Diffusers). It tests the engine's ability to strictly enforce interface compliance before allowing a module to be loaded into the activ...	03-13 00:30	Success	-	View
exp_2512.14930v1_20260313_002809 Paper: 2512.14930v1	RMPMAB Benchmark: High-Content Microscopy Simulation Architecture: Proposes a Restless Multi-Process Multi-Armed Bandit (RMPMAB) framework. Instead of deep neural networks, it models imaging regions as ensembles of Markov chains to capture biological heterogeneity. It relies on scalable W...	03-13 00:28	Success	-	View
exp_oa_W4404354530_20260313_002701 Paper: oa_W4404354530	Small Language Model (SLM) Efficiency Benchmark This survey establishes Small Language Models (SLMs) as the optimal solution for hardware-constrained inference (e.g., 8GB VRAM). It redefines SLMs by capability and resource suitability, distinguishing them from massive LLMs like Llama-3.1...	03-13 00:27	Success	-	View
exp_cr_10.1609_aaai.v38i16.29765_20260313_002602 Paper: cr_10.1609_aaai.v38i16.29765	What Makes Quantization for Large Language Model Hard? An Empirical Study from the Lens of Perturbation Architecture: Introduces a "perturbation lens" framework, analyzing quantization error as additive noise to weights and activations. This theory supports a non-uniform quantization scheme that adapts grid spacing to activation sensitivi...	03-13 00:26	Success	-	View
exp_pytrain.20260313002319.045_20260313_002354 Paper: pytrain.20260313002319.045	Strictly Typed Protocol & Resource Packager This benchmark evaluates the implementation of a strictly-typed, dependency-free resource packager. It verifies the correct usage of modern Python typing constructs, specifically `Protocol`, `TypeGuard`, and `TypedDict`, while ensuring perf...	03-13 00:24	Success	-	View
exp_2309.16783v2_20260313_002156 Paper: 2309.16783v2	Photonic Image Segmentation Benchmark Summary: Photonic Accelerators for Image Segmentation * Architecture: The paper evaluates image segmentation DNNs adapted for analog photonic chips. It identifies that specific architectures (likely those with noise-resilient struct...	03-13 00:22	Success	-	View
exp_oa_W4416768581_20260313_002047 Paper: oa_W4416768581	This benchmark implements a "Deep Research" agent architecture based on the systematic survey provided. It decomposes a... Paper: Deep Research: A Systematic Survey Assessment: Conceptual Framework / Agentic Workflow Architecture: Proposes a "Deep Research" agentic framework with four components: Query Planning, Information Acquisition (tool...	03-13 00:21	Success	-	View
exp_2512.10435v1_20260313_001955 Paper: 2512.10435v1	SRAP: Semantic Reconstruction of Adversarial Plagiarism Benchmark Paper: Semantic Reconstruction of Adversarial Plagiarism (SRAP) Summary: Architecture & Retrieval Strategy SRAP utilizes a two-stage pipeline: 1. Anomaly Detection: A fine-tuned SciBERT (domain-specific MLM) calculates token...	03-13 00:20	Success	-	View
exp_2512.15766v1_20260313_001913 Paper: 2512.15766v1	LOOPRAG: Enhancing Loop Transformation Optimization with Retrieval-Augmented Large Language Models Architecture: LOOPRAG combines a Large Language Model (LLM) with a parameter-driven retrieval system and a feedback-based iterative mechanism that utilizes compilation and testing results for verification. **Retrieval Specifics:...	03-13 00:19	Success	-	View
exp_pytrain.20260313001653.044_20260313_001734 Paper: pytrain.20260313001653.044	Python Skill Fallback Title: Structural Subtyping and Dynamic Module Loading - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-13 00:17	Success	-	View
exp_2512.11509v2_20260313_001500 Paper: 2512.11509v2	This repository provides a lightweight, reproducible benchmark designed to evaluate the computational trade-offs of thre... Paper Summary: Does Less Hallucination Mean Less Creativity? This study benchmarks hallucination mitigation methods—Chain of Verification (CoVe), Decoding by Contrasting Layers (DoLa), and RAG—across LLaMA, Qwen, and Mistral...	03-13 00:15	Success	-	View
exp_2512.12084v1_20260313_001359 Paper: 2512.12084v1	FloodSQL-Bench FloodSQL-Bench is a benchmark for evaluating Text-to-SQL systems on complex, multi-table geospatial queries involving spatial and hybrid joins within a flood management domain. * Architecture: It assesses RAG-enhanced LLMs rather th...	03-13 00:14	Success	-	View
exp_2512.12281v1_20260313_001309 Paper: 2512.12281v1	Cognitive-YOLO Architecture Synthesis Benchmark Architecture: Cognitive-YOLO synthesizes YOLO-style object detection networks defined in a Neural Architecture Description Language (NADL), instantiated via a compiler. RAG & Retrieval: The LLM uses RAG to retrieve SOTA detectio...	03-13 00:13	Success	-	View
exp_2512.12885v1_20260313_001224 Paper: 2512.12885v1	SignRAG Pipeline Benchmark Architecture: A dual-stage generative pipeline. An input image is captioned by a Vision Language Model (VLM). This text query retrieves candidates from a vector database, which a Large Language Model (LLM) synthesizes for final classifi...	03-13 00:12	Success	-	View
exp_pytrain.20260313001008.043_20260313_001046 Paper: pytrain.20260313001008.043	Asynchronous Type-Safe Asset Manifestor Overview This benchmark evaluates a Python CLI tool's ability to strictly enforce static typing using `typing.TypedDict` and `TypeAlias`, while correctly implementing `asyncio` for concurrent file processing. The Challenge The script (`mani...	03-13 00:10	Success	-	View
exp_2512.13059v1_20260313_000935 Paper: 2512.13059v1	An Open and Reproducible Deep Research Agent for Long-Form Question Answering Architecture: Iterative agentic workflow combining an LLM controller with a live Open Web Search API for retrieval, reasoning, and synthesis. RAG Strategy: * Retrieval: Live Web Search API (no static vector database). * **Indexi...	03-13 00:09	Success	-	View
exp_2512.13237v1_20260313_000804 Paper: 2512.13237v1	Learning to Retrieve with Weakened Labels: Robust Training under Label Noise Architecture & Training: This paper introduces a training methodology—Label Weakening—for standard Neural Encoders (Bi-Encoders) and Cross-Encoder rerankers. Instead of relying on single, potentially erroneous hard labels, the appro...	03-13 00:08	Success	-	View
exp_2601.10718v1_20260313_000722 Paper: 2601.10718v1	HPV AI Agent System Benchmark Architecture: ReAct Agent with RAG and multi-tool orchestration across five heterogeneous sources. Includes a secondary pipeline for automated report generation (sentiment/synthesis). RAG Details: * Retrieval: Vector databas...	03-13 00:07	Success	-	View
exp_2512.13573v2_20260313_000636 Paper: 2512.13573v2	MMhops-R1: Multimodal Multi-hop Reasoning Benchmark Architecture: MMhops-R1 is a multimodal Retrieval-Augmented Generation (mRAG) framework utilizing Reinforcement Learning (RL) to autonomously plan reasoning paths, generate targeted queries, and synthesize multi-level information. **Ret...	03-13 00:06	Success	-	View
exp_2512.14766v1_20260313_000556 Paper: 2512.14766v1	GR-Agent: Adaptive Graph Reasoning Benchmark Architecture: GR-Agent formalizes Knowledge Graph Question Answering (KGQA) as an agentic interaction loop, utilizing an LLM controller with access to specific graph reasoning tools. Retrieval Strategy: The *retrieval architecture...	03-13 00:06	Success	-	View
exp_pytrain.20260313000314.042_20260313_000354 Paper: pytrain.20260313000314.042	Robust Generic Service Container using PEP 695 This coding drill benchmark verifies the implementation of a generic `ServiceContainer` class utilizing PEP 695 Type Parameter Syntax (available in Python 3.12+). Features * Modern Type Syntax: Uses the new `class ClassName[T]:` syn...	03-13 00:03	Success	-	View
exp_2512.14792v1_20260313_000135 Paper: 2512.14792v1	IaC Generation with LLMs: An Error Taxonomy and A Study on Configuration Knowledge Injection Architecture & Retrieval Strategy: This paper implements a Graph RAG framework designed to enhance IaC (Terraform) generation. The retrieval architecture evolves from Naive RAG to a Knowledge Graph (KG) approach. It employs **semant...	03-13 00:01	Success	-	View
exp_cr_10.3390_info16090804_20260313_000046 Paper: cr_10.3390_info16090804	Secure Multifaceted-RAG (SecMulti-RAG) Benchmark Paper: Secure Multifaceted-RAG (SecMulti-RAG) Architecture & Retrieval: A hybrid RAG framework utilizing three knowledge sources: internal documents, pre-generated "Expert Knowledge" (static cache), and on-demand external LLM genera...	03-13 00:01	Success	-	View
exp_2506.12494v2_20260313_000001 Paper: 2506.12494v2	FlexRAG: A Flexible and Comprehensive Framework for Retrieval-Augmented Generation Architecture: Modular framework supporting text-based, multimodal, and network-based retrieval architectures. RAG Specs: Abstracts the retrieval pipeline; chunking and indexing strategies are user-defined (pluggable) rather...	03-13 00:00	Success	-	View
exp_2506.13743v1_20260312_235908 Paper: 2506.13743v1	LTRR: Learning To Rank Retrievers for LLMs Paper: LTRR: Learning To Rank Retrievers for LLMs Architecture: LTRR implements a Query Routing strategy using a Learning-to-Rank (LTR) model (specifically XGBoost) to dynamically select the optimal retriever from a heterogeneou...	03-12 23:59	Success	-	View
exp_pytrain.20260312235706.041_20260312_235725 Paper: pytrain.20260312235706.041	Type-Safe Plugin Dispatcher Benchmark This project demonstrates a robust, modular plugin architecture using Python's `typing.Protocol` and `@runtime_checkable` decorators. It simulates the behavior of Python packaging entry points (like `setup.py` entry points or `pyproject.tom...	03-12 23:57	Success	-	View
exp_2506.14084v1_20260312_235522 Paper: 2506.14084v1	Lightweight Relevance Grader in RAG Architecture: Fine-tuned Llama-3.2-1B deployed as a binary relevance grader (classifier) within a RAG pipeline to filter documents post-retrieval. Memory Footprint: Extreme efficiency. At 1B parameters, the model requires ~2GB VRAM...	03-12 23:55	Success	-	View
exp_2506.14516v2_20260312_235418 Paper: 2506.14516v2	Benchmark for G-RAG: Generation-Retrieval-Augmented Generation Architecture: A "Generation-Retrieval-Augmented Generation" (G-RAG) pipeline. Retrieval & Reranking Strategy: The system employs HyDE (Hypothetical Document Embeddings), where the LLM generates a synthetic answer to augment retr...	03-12 23:54	Success	-	View
exp_2506.14529v1_20260312_235333 Paper: 2506.14529v1	Automated Decision-Making on Networks with LLMs through Knowledge-Guided Evolution Architecture: LLMNet is an agentic AutoML framework, not a standalone inference model. It employs LLM agents to iteratively design and refine GNN architectures via a knowledge-guided evolutionary process. RAG & Retrieval: Uses RAG t...	03-12 23:53	Success	-	View
exp_cr_10.3390_math13050856_20260312_235237 Paper: cr_10.3390_math13050856	Benchmark Design: RAG Hallucination Mitigation via Grounded Constraints Paper Type: Comprehensive Survey. Architecture: Reviews standard RAG frameworks (Retriever + LLM), analyzing hallucination sources (confabulations) in both retrieval (missed top-k) and generation (ignoring context) sub-tasks. **RAG...	03-12 23:52	Success	-	View
exp_pytrain.20260312235018.040_20260312_235045 Paper: pytrain.20260312235018.040	Dynamic Package Injection and Protocol Verification This benchmark tests the ability to generate Python package structures dynamically at runtime, inject them into the Python interpreter path, and enforce strict type compliance using `typing.Protocol`. Objective 1. Dynamic Packaging: Pro...	03-12 23:50	Success	-	View
exp_cr_10.1038_s41746-025-01536-y_20260312_234843 Paper: cr_10.1038_s41746-025-01536-y	Evaluating LLMs vs. RAG in Neurology: Benchmark Suite Evaluation Scope: Clinical performance comparison of Base LLMs vs. Retrieval-Augmented Generation (RAG) in neurology. Architecture: * RAG Variants: "Document-enabled" (static guidelines) and "Online-enabled" (live web search). *...	03-12 23:49	Success	-	View
exp_cr_10.1007_s10278-025-01483-w_20260312_234803 Paper: cr_10.1007_s10278-025-01483-w	Evaluation of a Retrieval-Augmented Generation-Powered Chatbot for Pre-CT Informed Consent: a Prospective Comparative St... Status: Technical specifications omitted. This paper is a clinical outcome study, not an engineering report. Essential architectural details for the ARES 8GB roadmap are not disclosed: * Architecture: The underlying LLM (e.g., L...	03-12 23:48	Success	-	View
exp_2410.00005v1_20260312_234714 Paper: 2410.00005v1	Benchmark: Meta KDD Cup '24 Winning Solution (CRAG System) Architecture: Hybrid RAG system combining unstructured web search with structured Knowledge Graph (KG) access via tool use. Retrieval Strategy: Uses a "regularized API set" where a tuned LLM generates specific API calls to query the...	03-12 23:47	Success	-	View
exp_2409.09510v2_20260312_234624 Paper: 2409.09510v2	Personalization Benchmark: RAG vs. PEFT Summary This paper evaluates RAG versus Parameter-Efficient Fine-Tuning (PEFT) for privacy-preserving LLM personalization on the LaMP benchmark. Architecture: Contrasts standard RAG (prompt enrichment) against PEFT (likely LoRA/Adap...	03-12 23:46	Success	-	View
exp_2409.09582v2_20260312_234541 Paper: 2409.09582v2	NEVLP Benchmark Implementation Architecture: NEVLP bridges a frozen image encoder and a frozen LLM using a trainable Transformer connector. It optimizes training via noise-adaptive learning (estimating noise probabilities) and concept-enhanced learning (i...	03-12 23:45	Success	-	View
exp_pytrain.20260312234343.039_20260312_234409 Paper: pytrain.20260312234343.039	Python Skill Fallback Title: Type-Safe Dynamic Backend Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-12 23:44	Success	-	View
exp_2409.18986v2_20260312_234307 Paper: 2409.18986v2	Lab-AI: Using Retrieval Augmentation to Enhance Language Models for Personalized Lab Test Interpretation in Clinical Med... Architecture & Feasibility: Lab-AI utilizes a two-stage RAG pipeline: Factor Retrieval (identifying patient demographics) followed by Normal Range Retrieval (fetching conditional reference data), orchestrated via GPT-4-turbo. Th...	03-12 23:43	Success	-	View
exp_2409.10825v5_20260312_234139 Paper: 2409.10825v5	Benchmark: Bias Mitigation in LLM Recommendations Architecture: Evaluates off-the-shelf LLMs (LLaMA, GPT, Gemini) for recommendation tasks; proposes a Retrieval-Augmented Generation (RAG) framework to mitigate algorithmic bias by retrieving diverse candidates to counteract skewed train...	03-12 23:41	Success	-	View
exp_2409.11279v1_20260312_234058 Paper: 2409.11279v1	P-RAG: Progressive Retrieval Augmented Generation Benchmark Architecture: LLM-based agent utilizing an iterative, self-updating retrieval loop. Retrieval Strategy: Progressive RAG. Unlike static RAG, it accumulates "experiences" (historical interactions) into a dynamic database. It uses a gr...	03-12 23:41	Success	-	View
exp_2409.12140v2_20260312_234014 Paper: 2409.12140v2	MoRAG Benchmark: Evaluating Multi-Fusion Retrieval & SSM Optimization Architecture: MoRAG augments motion diffusion models via a dual-module pipeline: an LLM for query normalization (spelling/rephrasing) and a multi-part retriever that performs spatial composition of part-specific motion features. **RAG S...	03-12 23:40	Success	-	View
exp_2409.12519v3_20260312_233929 Paper: 2409.12519v3	This repository contains the runnable benchmark code for the Multi-View Adaptive Contrastive Learning for Information Re... Architecture: MACL-IRFL utilizes Graph Neural Networks (GNNs) combined with Adaptive Contrastive Learning. It generates embeddings by aggregating information from three specific graph views: report-code interaction, report-report simila...	03-12 23:39	Success	-	View
exp_pytrain.20260312233713.038_20260312_233741 Paper: pytrain.20260312233713.038	Python Skill Fallback Title: Typed Configuration Package Module - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-12 23:37	Success	-	View
exp_2409.12941v3_20260312_233633 Paper: 2409.12941v3	Fact, Fetch, and Reason (FRAMES) Benchmark Paper Summary: Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation Focus: Evaluation Benchmark (FRAMES) for Multi-hop RAG. * Retrieval Architecture: The paper proposes a **multi-step retrieval pipel...	03-12 23:36	Success	-	View
exp_2409.13537v1_20260312_233514 Paper: 2409.13537v1	ShizishanGPT: An Agricultural Large Language Model Integrating Tools and Resources Architecture ShizishanGPT is a modular agent framework integrating a Retrieval Augmented Generation (RAG) pipeline with an Agricultural Knowledge Graph (KG) and external tool execution. It relies on a heavy GPT-4 backbone for generi...	03-12 23:35	Success	-	View
exp_2409.14083v1_20260312_233420 Paper: 2409.14083v1	SURf Benchmark Suite Architecture: SURf is a self-refinement fine-tuning framework for LVLMs. It constructs training sets using positive (corrective) and negative (misleading) multimodal references to teach the model backbone how to selectively filter retri...	03-12 23:34	Success	-	View
exp_2403.12582v1_20260312_233338 Paper: 2403.12582v1	README: AlphaFin Benchmarking Suite Paper: AlphaFin (Stock-Chain) Architecture: A retrieval-augmented generation (RAG) framework trained on the AlphaFin benchmark, combining real-time financial data with handwritten chain-of-thought (CoT) reasoning. RAG Specifics:...	03-12 23:33	Success	-	View
exp_2404.10779v1_20260312_233240 Paper: 2404.10779v1	Fine-Tuning LLM for Enterprise: Benchmark Suite Architecture: Focuses on fine-tuning open-weight models (specifically LLaMA) on proprietary enterprise data (documentation and code) to surpass standard Retrieval-Augmented Generation (RAG) quality, arguing RAG is limited by vector data...	03-12 23:32	Success	-	View
exp_pytrain.20260312233045.037_20260312_233104 Paper: pytrain.20260312233045.037	Python Skill Fallback Title: Runtime-Typed Plugin Loader with Dynamic Package Discovery - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-12 23:31	Success	-	View
exp_2403.17428v2_20260312_232919 Paper: 2403.17428v2	Aligning Large Language Models for Enhancing Psychiatric Interviews Through Symptom Delineation and Summarization: Pilot... Architecture: Proposes a multi-stage pipeline: (1) Stressor Extraction (NER), (2) Symptom Section Identification (Span Detection), and (3) Summarization using extracted context. RAG Strategy: The paper explicitly states RAG showed *...	03-12 23:29	Success	-	View
exp_2403.17645v3_20260312_232826 Paper: 2403.17645v3	DANCER: Entity Description Augmented Named Entity Corrector for Automatic Speech Recognition Architecture: DANCER proposes an Efficient Entity Description Augmented Masked Language Model (EDA-MLM) for post-ASR error correction. It replaces traditional phonetic edit-distance algorithms with a hybrid **dense retrieval + Masked La...	03-12 23:28	Success	-	View
exp_2403.17848v1_20260312_232656 Paper: 2403.17848v1	ArabicaQA Benchmark Suite Paper: ArabicaQA: A Comprehensive Dataset for Arabic Question Answering Summary for ARES 8GB Roadmap: * Architecture: The paper introduces AraDPR, a Dense Passage Retrieval (DPR) model (Dual-encoder BERT-based) tailored...	03-12 23:27	Success	-	View
exp_2309.11322v2_20260312_232615 Paper: 2309.11322v2	Vector database management systems: Fundamental concepts, use-cases, and current challenges Architecture: Narrative review of Vector Database Management Systems (VDBMS) designed for high-dimensional, sparse data. RAG Specifics: * Retrieval Architecture: Approximate Nearest Neighbor (ANN) similarity search. * **Indexing...	03-12 23:26	Success	-	View
exp_pytrain.20260312232355.036_20260312_232431 Paper: pytrain.20260312232355.036	Python Skill Fallback Title: Dynamic Namespace Loader with Protocol Enforcement - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-12 23:24	Success	-	View
exp_2309.12132v2_20260312_232208 Paper: 2309.12132v2	Benchmark Design: GraphRAG vs. Vanilla LLM for Contract Review Architecture: A tuning-free GraphRAG framework combining LLMs with a Nested Contract Knowledge Graph (NCKG). Retrieval Strategy: Utilizes NCKG-based graph traversal instead of vector chunking. The system indexes contract clauses...	03-12 23:22	Success	-	View
exp_2309.15427v2_20260312_232118 Paper: 2309.15427v2	Graph Neural Prompting (GNP) Benchmark Architecture: GNP augments a frozen LLM with a trainable Graph Neural Network (GNN) encoder and a domain projector. It extracts embeddings from Knowledge Graph (KG) subgraphs and converts them into continuous "soft prompts" to guide the...	03-12 23:21	Success	-	View
exp_2309.16035v3_20260312_232021 Paper: 2309.16035v3	MKRAG Efficiency Benchmark Architecture: Standard RAG pipeline coupling a retrieval encoder with a Vicuna-7B generator. Avoids fine-tuning, relying on prompt injection for domain adaptation. Retrieval Strategy: Extracts facts from the MedQA-SMILE dataset. Spe...	03-12 23:20	Success	-	View
exp_2303.14369v1_20260312_231937 Paper: 2303.14369v1	Benchmark Design for HBI (Hierarchical Banzhaf Interaction) Architecture: Proposes Hierarchical Banzhaf Interaction (HBI), modeling video frames and text words as cooperative game players. It stacks token-merge modules to cluster inputs and compute fine-grained interactions at multiple semantic...	03-12 23:19	Success	-	View
exp_pytrain.20260312231732.035_20260312_231759 Paper: pytrain.20260312231732.035	Python Skill Fallback Title: Generic Data Store & CLI Module - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-12 23:18	Success	-	View
exp_2303.16145v1_20260312_230544 Paper: 2303.16145v1	Benchmark: NeuralMind-UNICAMP mT5 CLIR Reranker Architecture: Utilizes mT5-XXL (approx. 11B parameters) as a cross-lingual reranker within a two-stage retrieval pipeline. Retrieval & Context: * 1st Stage: Sparse retrieval (BM25). * 2nd Stage: mT5-XXL reranks query-doc...	03-12 23:15	Success	-	View
exp_2304.01003v1_20260312_230449 Paper: 2304.01003v1	QUADRo: Dataset and Models for QUestion-Answer Database Retrieval Paper: QUADRo: Dataset and Models for QUestion-Answer Database Retrieval Summary: Architecture: A dual-stage Neural IR pipeline utilizing a Bi-Encoder for retrieval and a Cross-Encoder for reranking. The system encodes b...	03-12 23:04	Success	-	View
exp_pytrain.20260312230140.034_20260312_230225 Paper: pytrain.20260312230140.034	Python Skill Fallback Title: Strictly-Typed Model Artifact Packager - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-12 23:02	Success	-	View
exp_hf_2603.09229_20260312_225857 Paper: hf_2603.09229	Flash-KMeans: Fast and Memory-Efficient Exact K-Means Architecture: Flash-KMeans replaces standard GPU K-means stages with two kernel-level innovations. FlashAssign fuses distance computation with online argmin selection, bypassing intermediate memory writes. The *sort-inverse update...	03-12 22:59	Success	-	View
exp_hf_2603.10702_20260312_225727 Paper: hf_2603.10702	UniCom: Unified Multimodal Modeling Benchmark Architecture UniCom utilizes a transfusion architecture (superior to query-based designs) featuring an attention-based semantic compressor. It generates compact, continuous semantic representations by prioritizing channel reduct...	03-12 22:57	Success	-	View
exp_cr_10.3390_sym17030471_20260312_225621 Paper: cr_10.3390_sym17030471	Benchmark: Improved Model-Free Adaptive Predictive Control (MFAPC) Verdict: Incompatible This paper addresses Control Theory (Model-Free Adaptive Predictive Control), not Deep Learning. It focuses on networked cyber-physical systems under DoS attacks and does not describe a neural network architect...	03-12 22:56	Success	-	View
exp_pytrain.20260312225328.033_20260312_225403 Paper: pytrain.20260312225328.033	Strictly-Typed Kernel Registry Benchmark Overview This benchmark simulates a high-performance kernel registration subsystem similar to those found in vLLM or PyTorch. It tests the hypothesis that enforcing strict `typing.Protocol` constraints at import-time reduces runtime errors...	03-12 22:54	Success	-	View
exp_cr_10.1609_aaai.v38i17.29815_20260312_225059 Paper: cr_10.1609_aaai.v38i17.29815	Benchmark for Norm Tweaking in Low-Bit Quantization Architecture: A plugin for existing Post-Training Quantization (PTQ) pipelines. It does not alter core Transformer blocks but modifies Layer Normalization weights. The method aligns the distribution of quantized activations with their f...	03-12 22:51	Success	-	View
exp_2512.10596v1_20260312_224927 Paper: 2512.10596v1	Benchmark: Beyond Pixels (T2T Retrieval) Architecture: Proposes TRSLLaVA, a training-free framework converting cross-modal retrieval into Text-to-Text (T2T) matching. It replaces vision encoders with a VLM (LLaVA) to generate structured captions for images, aligning th...	03-12 22:49	Success	-	View
exp_2512.14102v1_20260312_224836 Paper: 2512.14102v1	Neurosymbolic Inference On Foundation Models For Remote Sensing Text-to-image Retrieval With Complex Queries Architecture: Neurosymbolic framework (RUNE) combining Large Language Models (LLMs), object detectors, and First-Order Logic (FOL). It treats text-to-image retrieval as a symbolic reasoning task rather than implicit vector matching. **R...	03-12 22:48	Success	-	View
exp_cr_10.14419_dzzstd42_20260312_224734 Paper: cr_10.14419_dzzstd42	DNGR: Deep Neural Graph-Based Recommendation System for Scholarly Paper Retrieval Architecture: DNGR couples Graph Neural Networks (GNNs) with SciBERT embeddings, processing a heterogeneous academic graph of citations, authors, and topics. Retrieval & RAG Details: * Architecture: Deep Neural Graph-based Recom...	03-12 22:47	Success	-	View
exp_pytrain.20260312224454.032_20260312_224533 Paper: pytrain.20260312224454.032	Type-Safe Plugin Registry for Model Configurations This benchmark evaluates a Python coding system's ability to implement robust, type-safe package architecture using standard library features. The task is to construct a modular "model registry" system similar to those found in enterprise A...	03-12 22:45	Success	-	View
exp_2506.14445v1_20260312_224302 Paper: 2506.14445v1	Vela: Multimodal Embedding Benchmark Architecture: Vela repurposes a frozen Voice Large Language Model (vLLM) as a dual-encoder to generate unified multimodal embeddings. It bridges the text-audio gap using prompt engineering and in-context learning, training exclusively o...	03-12 22:43	Success	-	View
exp_2409.09721v2_20260312_224158 Paper: 2409.09721v2	Finetuning CLIP to Reason about Pairwise Differences Architecture: Standard CLIP dual-encoder (ViT + Text Transformer) finetuned via contrastive learning on synthetic LLM-generated data to align image embedding differences ($I_1 - I_2$) with text descriptions of differences. **Memory Foot...	03-12 22:42	Success	-	View
exp_2403.15378v3_20260312_224059 Paper: 2403.15378v3	Long-CLIP: Unlocking Long-Text Capability Benchmark Architecture: Long-CLIP addresses CLIP’s 77-token limit via two efficient fine-tuning strategies: knowledge-preserved stretching of positional embeddings and primary component matching of features. This preserves the original latent spa...	03-12 22:41	Success	-	View
exp_pytrain.20260312223822.031_20260312_223854 Paper: pytrain.20260312223822.031	Strict Protocol Plugin Loader Benchmark This benchmark tests the hypothesis that combining `typing.Protocol` with `importlib` allows for a robust, zero-dependency plugin system that validates interfaces at runtime without manual registration. Instructions 1. Save the code below a...	03-12 22:38	Success	-	View
exp_2403.16265v1_20260312_223608 Paper: 2403.16265v1	Benchmark: Graph-Augmented Patent Phrase Similarity Architecture: Hybrid retrieval-augmented encoder combining a standard contextualized model (e.g., BERT) with a Graph Neural Network (GNN). Retrieval Architecture: Citation-Graph Retrieval. Instead of standard chunking, it constr...	03-12 22:36	Success	-	View
exp_cr_10.1609_aaai.v38i8.28714_20260312_223443 Paper: cr_10.1609_aaai.v38i8.28714	UniGen: Unified Retrieval and QA Benchmark Architecture: Dual-decoder Transformer (Shared Encoder + Generative Retrieval Decoder + QA Decoder). Utilizes LLM-generated connectors to bridge query-to-doc and doc-to-answer representations. Retrieval Strategy: **Generative Docume...	03-12 22:34	Success	-	View
exp_2303.11313v3_20260312_223349 Paper: 2303.11313v3	CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition Paper: CLIP goes 3D (CG3D) Architecture Introduces a learnable 3D point cloud encoder aligned with frozen CLIP (Vision/Text) encoders. It uses contrastive loss on triplets of (Pointcloud, Rendered Image, Text). *Retrieval Strategy...	03-12 22:33	Success	-	View
exp_pytrain.20260312223032.030_20260312_223108 Paper: pytrain.20260312223032.030	Python Skill Fallback Title: Type-Annotated Async Fetcher with Package Structure - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-12 22:31	Success	-	View
exp_2309.16889v2_20260312_222904 Paper: 2309.16889v2	Superpixel Transformers for Efficient Semantic Segmentation Architecture: The model replaces dense pixel processing with a Superpixel Transformer backbone. It utilizes local cross-attention to dynamically map pixels to a reduced set of "superpixel" tokens. Standard multi-head self-attention is t...	03-12 22:29	Success	-	View
exp_2512.11506v2_20260312_222800 Paper: 2512.11506v2	EmeraldMind Benchmark EmeraldMind Summary * Architecture: A GraphRAG framework integrating a domain-specific Knowledge Graph (EmeraldGraph) with an LLM to verify claims against ESG reports. * Memory Footprint: High Efficiency. The heavy memor...	03-12 22:28	Success	-	View
exp_2512.14744v1_20260312_222708 Paper: 2512.14744v1	VERAFI: Verified Agentic Financial Intelligence through Neurosymbolic Policy Generation Architecture: Neurosymbolic Agentic framework combining Dense Retrieval, Cross-Encoder Reranking, and automated reasoning policies (GAAP/SEC/Math validation). RAG Specs: Dense Retrieval + Cross-Encoder Reranking. No specific chu...	03-12 22:27	Success	-	View
exp_pytrain.20260312222417.029_20260312_222454 Paper: pytrain.20260312222417.029	Package Metadata & Type Coverage Verifier This benchmark evaluates the ability to construct a static analysis tool using Python's standard library. The goal is to inspect a namespace (simulated by `globals()`) to verify packaging compliance (checking `__all__` integrity) and type c...	03-12 22:24	Success	-	View
exp_2601.06039v1_20260312_222232 Paper: 2601.06039v1	Operation Veja: VEJA Framework Benchmark Architecture: None. This is a data curation framework, not a model architecture proposal. Methodology: Introduces the VEJA paradigm (Values, Experiences, Judgments, Abilities) to generate training data that fosters "deliberative...	03-12 22:22	Success	-	View
exp_2512.12858v1_20260312_222122 Paper: 2512.12858v1	Benchmark: Information-Consistent LM Recommendations (GRPO) Architecture: Proposes a reinforcement learning framework utilizing Group Relative Policy Optimization (GRPO) to minimize output variance across semantically equivalent prompt groups. This is a model alignment/training technique, not a...	03-12 22:21	Success	-	View
exp_pytrain.20260312221728.028_20260312_221831 Paper: pytrain.20260312221728.028	Generic Plugin Registry with PEP 695 Type Parameters Overview This benchmark evaluates the design and implementation of a strictly-typed Plugin Registry system utilizing Python 3.12+ features, specifically PEP 695 Type Parameter Syntax. Features - PEP 695 Syntax: Uses the new `cla...	03-12 22:18	Success	-	View
exp_2512.13074v1_20260312_221513 Paper: 2512.13074v1	Benchmark: Symmetric Consistent Indexing (SCI) for Dense Retrieval Architecture: SCI enhances standard dual-tower dense retrieval by addressing representational misalignment and training-inference inconsistency. Retrieval Specs: * Indexing Strategy: Implements Dual-view indexing to ensu...	03-12 22:15	Success	-	View
exp_2512.14762v1_20260312_221355 Paper: 2512.14762v1	Benchmark: Workflows vs Agents for Code Translation Architecture: Compares fixed workflows against an MCP-based agentic framework for MATLAB-to-HDL syntax repair. The agent architecture dynamically selects tools rather than following a static chain. Retrieval & Context: Utilizes...	03-12 22:14	Success	-	View
exp_2512.14179v1_20260312_221212 Paper: 2512.14179v1	Benchmark: RAG Pipelines for Bengali Dialect Translation Validation: This paper validates the ARES 8GB strategy, demonstrating that retrieval augmentation allows an 8B parameter model (Llama-3.1-8B) to outperform 120B-class models in low-resource translation. Retrieval Architecture: The s...	03-12 22:12	Success	-	View
exp_pytrain.20260312220925.027_20260312_220959 Paper: pytrain.20260312220925.027	Type-Safe Plugin Registry Factory Overview This coding drill challenges you to implement a robust, generic Plugin Registry system in Python, inspired by the extensibility mechanisms found in frameworks like PyTorch and LitGPT. Objective Create a `PluginRegistry` class that...	03-12 22:10	Success	-	View
exp_2512.14417v1_20260312_220657 Paper: 2512.14417v1	PortAgent: LLM-driven Vehicle Dispatching Agent for Port Terminals Architecture: Multi-agent framework (Virtual Expert Team) utilizing four specialized roles (Retriever, Modeler, Coder, Debugger) and a Reflexion-inspired self-correction loop to automate vehicle dispatching logic. *RAG Implementation:...	03-12 22:07	Success	-	View
exp_cr_10.64552_wipiec.v11i1.95_20260312_220532 Paper: cr_10.64552_wipiec.v11i1.95	MicroRAG Benchmark Architecture: RAG-based framework targeting technical microarchitecture documentation (AURIX TriCore). Memory Footprint: The study validates 3B and 8B parameter models against a 72B baseline. An 8B model is highly suitable for 8GB V...	03-12 22:06	Success	-	View
exp_pytrain.20260312220144.026_20260312_220237 Paper: pytrain.20260312220144.026	Generic Pipeline Registry Benchmark This benchmark evaluates a Python implementation of a modular processing pipeline using modern typing features (`typing.Protocol`, `typing.TypeVar`, `typing.Generic`). Key Concepts * `ProcessingStep` Protocol: Defines the contract for a...	03-12 22:02	Success	-	View
exp_cr_10.3390_info16090766_20260312_214913 Paper: cr_10.3390_info16090766	This repository contains the benchmarking code for the paper titled "Retrieval-Augmented Generation vs. Baseline LLMs:... Analysis for ARES 8GB Roadmap:** * Architecture: Evaluates RAG-augmented performance against baselines for TinyLlama (1.1B), Mistral (7B), Llama 3.1 (8B), and Llama 1 (13B). * RAG Specifics: The abstract lacks technical specifics...	03-12 21:59	Success	-	View
exp_cr_10.3390_info16090786_20260312_214825 Paper: cr_10.3390_info16090786	Analysis of Large Language Models for Company Annual Reports Based on Retrieval-Augmented Generation Paper Type: Evaluation Study (Proprietary Models) Summary: This paper assesses the performance of cloud-based LLMs (ChatGPT-4, Gemini) enhanced with Retrieval-Augmented Generation (RAG) for analyzing financial annual reports. * **Ar...	03-12 21:48	Success	-	View
exp_cr_10.3390_computers14090382_20260312_214740 Paper: cr_10.3390_computers14090382	GraphTrace: A Modular Retrieval Framework Combining Knowledge Graphs and Large Language Models for Multi-Hop Question An... Architecture: Modular Graph-based RAG utilizing a Knowledge Graph (KG) rather than vector stores. Retrieval Strategy: * Indexing: Structured entity relationships (domain-specific KG), bypassing traditional text chunking. * **Pro...	03-12 21:47	Success	-	View
exp_cr_10.32996_jcsts.2025.7.9.56_20260312_214637 Paper: cr_10.32996_jcsts.2025.7.9.56	This benchmark simulates the performance characteristics of the described "Contextual Retrieval-Augmented Generation" ar... Architecture: Serverless RAG pipeline utilizing AWS Kendra for retrieval and external Claude API for generation, orchestrated via API Gateway and Lambda. Retrieval & Context: * Retrieval Architecture: AWS Kendra (Managed...	03-12 21:46	Success	-	View
exp_pytrain.20260312214356.025_20260312_214425 Paper: pytrain.20260312214356.025	Robust Plug-in Loader with Runtime Protocol Verification This benchmark evaluates the design of a type-safe, extensible plugin architecture using Python's `typing.Protocol` and `@runtime_checkable`. Overview The script implements a simulated data processing package. It defines a strict behavioral...	03-12 21:44	Success	-	View
exp_cr_10.3390_electronics14183676_20260312_214201 Paper: cr_10.3390_electronics14183676	Enhancing Clinical Named Entity Recognition via Fine-Tuned BERT and Dictionary-Infused Retrieval-Augmented Generation Architecture: Two-stage pipeline. Stage 1 utilizes a fine-tuned BERT for clinical NER. Stage 2 employs a Dictionary-Infused Retrieval-Augmented Generation (DiRAG) module for terminology normalization, merging semantic retrieval with...	03-12 21:42	Success	-	View
exp_cr_10.3390_biomimetics10090626_20260312_214043 Paper: cr_10.3390_biomimetics10090626	Benchmark: Biomimicry Design Spiral RAG Framework Architecture: A specialized, stage-specific RAG framework coupling a locally hosted Llama 3.1 model with a domain-specific AskNature corpus (2,106 documents) to facilitate the Biomimicry Design Spiral (BSD). RAG Specifics: *...	03-12 21:40	Success	-	View
exp_oa_W4410600121_20260312_213933 Paper: oa_W4410600121	Document GraphRAG Benchmark Architecture: Knowledge Graph-enhanced RAG (GraphRAG) leveraging document-intrinsic structure. Retrieval & Indexing: Uses graph-based document structuring and keyword-based semantic linking. It optimizes retrieval by tuning...	03-12 21:39	Success	-	View
exp_pytrain.20260312213630.024_20260312_213706 Paper: pytrain.20260312213630.024	Strictly-Typed Plugin Dispatcher Benchmark Objective This benchmark evaluates a Python implementation of a modular plugin dispatcher. It validates the hypothesis that utilizing Structural Subtyping (via `typing.Protocol`) combined with Generics ensures strict adherence to component...	03-12 21:37	Success	-	View
exp_2506.12637v2_20260312_213448 Paper: 2506.12637v2	How Grounded is Wikipedia? A Study on Structured Evidential Support and Retrieval Assessment: This is a study and dataset release (PeopleProfiles), not a novel model architecture. It evaluates the reliability of Wikipedia citations and the efficacy of retrieval systems in finding supporting evidence. **Retrieval...	03-12 21:34	Success	-	View
exp_2506.12895v1_20260312_213342 Paper: 2506.12895v1	Legal IR Performance Benchmark Architecture: Comparative analysis of Lexical (BM25) vs. Dense Retrieval (Transformer-based Bi-encoders). Retrieval Strategy: Passage-level retrieval of legal decisions. Key Findings: Off-the-shelf dense models underperf...	03-12 21:33	Success	-	View
exp_2506.14086v1_20260312_213214 Paper: 2506.14086v1	InsertRank: Benchmark Suite InsertRank employs a Listwise Reranking architecture. It integrates BM25 lexical scores directly into the LLM prompt, allowing the model to reason over retrieval signals rather than just semantic text. * **Retrieval Architecture...	03-12 21:32	Success	-	View
exp_pytrain.20260312212933.023_20260312_213008 Paper: pytrain.20260312212933.023	Strict Type ZipApp Bundler Overview This project provides a robust CLI tool that enforces strict typing within Python source files before bundling them into a portable ZipApp (`.pyz`) executable. Hypothesis An autonomous coding system demonstrates advanced capability...	03-12 21:30	Success	-	View
exp_2506.14336v1_20260312_212905 Paper: 2506.14336v1	AviationLLM Benchmark: RALA-DPO vs Base SFT Architecture: RALA-DPO utilizes a Qwen base model, fine-tuned via Direct Preference Optimization (DPO) and enhanced with Retrieval-Augmented Generation (RAG). RAG Pipeline: The abstract confirms RAG usage to mitigate hal...	03-12 21:29	Success	-	View
exp_2506.14488v1_20260312_212747 Paper: 2506.14488v1	Benchmark: Retrieval-Enhanced Aligned Diffusion (READ) Architecture: READ integrates an SE(3)-equivariant diffusion model with a contrastively pre-trained graph encoder to align atom-level representations. RAG Specifics: * Retrieval Architecture: Graph-based retrieval using a pre-tr...	03-12 21:27	Success	-	View
exp_2506.15241v1_20260312_212707 Paper: 2506.15241v1	Research on Graph-Retrieval Augmented Generation Based on Historical Text Knowledge Graphs Architecture: A GraphRAG framework combining Knowledge Graph (KG) retrieval with Chain-of-Thought (CoT) prompting. It utilizes a collaborative KG-LLM mechanism to improve entity alignment and reduce hallucinations in historical text ana...	03-12 21:27	Success	-	View
exp_2506.15415v1_20260312_212617 Paper: 2506.15415v1	Benchmark Design: Targeted Lexical Injection (TLI) Architecture: The paper applies Targeted Lexical Injection (TLI) to Lugha-Llama-8B. This method uses LoRA to fine-tune embeddings specifically from Layer 2 (identified as the peak alignment layer) using a contrastive objecti...	03-12 21:26	Success	-	View
exp_pytrain.20260312212238.022_20260312_212341 Paper: pytrain.20260312212238.022	Modern Generic Cache with PEP 695 Syntax Overview This benchmark evaluates the implementation of a modern, type-safe in-memory cache using Python 3.12's PEP 695 Type Parameter Syntax. Features - PEP 695 Syntax: Utilizes the new `class MyClass[T]:` and `type MyAlias[T]...	03-12 21:23	Success	-	View
exp_2506.15569v1_20260312_212050 Paper: 2506.15569v1	SciVer: Evaluating Foundation Models for Multimodal Scientific Claim Verification Paper: SciVer (Benchmark) Category: Evaluation & RAG Analysis Architecture: Focuses on Multimodal LLMs suitable for local inference, specifically Llama-3.2-Vision and Qwen2.5-VL. RAG & Retrieval Strategy: * **Arc...	03-12 21:20	Success	-	View
exp_2506.15655v2_20260312_211955 Paper: 2506.15655v2	cAST: Enhancing Code Retrieval-Augmented Generation with Structural Chunking via Abstract Syntax Tree Architecture: cAST proposes a structure-aware preprocessing pipeline for Code RAG, replacing heuristic line-based splitting with Abstract Syntax Tree (AST) parsing. Retrieval & Chunking Strategy: * *Retrieval Architecture:...	03-12 21:19	Success	-	View
exp_2506.21596v2_20260312_211855 Paper: 2506.21596v2	Evaluating Multimodal Large Language Models on Educational Textbook Question Answering Architecture: Benchmarks LLaVA-1.5 and LLaMA 3.2-Vision (VLMs) on the CK12-QA dataset. Retrieval Architecture: Multimodal RAG pipeline providing lesson paragraphs and diagrams. (Indexing/chunking strategy and reranking methods are n...	03-12 21:18	Success	-	View
exp_pytrain.20260312211539.021_20260312_211617 Paper: pytrain.20260312211539.021	Generic Event Dispatcher with Protocol-Based Registration Overview This benchmark demonstrates an autonomous coding system designing an extensible, loosely-coupled architecture using Python's advanced typing features. Hypothesis: An autonomous coding system can effectively design extensible, l...	03-12 21:16	Success	-	View
exp_2506.15911v2_20260312_211239 Paper: 2506.15911v2	Tibbe-AG: Islamic Medicine Response Validation Benchmark Architecture: Evaluates 7B-class models (LLaMA-3, Mistral-7B, Qwen2-7B) within a multi-stage pipeline. The flow transitions from Retrieval-Augmented Generation (RAG) to a Scientific Self-Critique Agent, concluding with an **LLM-...	03-12 21:13	Success	-	View
exp_2506.16172v1_20260312_211058 Paper: 2506.16172v1	Benchmark: SGIC (Self-Guided Iterative Calibration) for RAG Architecture & RAG Strategy: SGIC introduces an iterative wrapper around standard RAG, utilizing an uncertainty estimator to perform reranking based on document relevance and LLM confidence. It employs a **multi-round calibratio...	03-12 21:11	Success	-	View
exp_2506.16411v2_20260312_210957 Paper: 2506.16411v2	This repository contains a synthetic benchmark to evaluate the "Noise Decomposition Framework" for Long Context LLMs... Architecture: Proposes a MapReduce-style "Multi-Agent Chunking" framework. It splits long inputs into fixed-size segments to minimize "Model Noise" (fidelity decay in long sequences) and aggregates partial results. Memory Footprint:...	03-12 21:10	Success	-	View
exp_pytrain.20260312210717.020_20260312_210806 Paper: pytrain.20260312210717.020	Type-Safe Dynamic Kernel Packager This benchmark demonstrates the creation of a simulated AI kernel plugin system. It bridges static type definitions using `typing.Protocol` with dynamic runtime module loading via `zipfile`. Overview The script performs the following operat...	03-12 21:08	Success	-	View
exp_cr_10.3390_a18030155_20260312_210516 Paper: cr_10.3390_a18030155	This benchmark evaluates the computational efficiency of the Text-Guided Synthesis framework for colonoscopy data augmen... Architecture: The framework employs Stable Diffusion fine-tuned with DreamBooth Low-Rank Adaptation (LoRA) for synthetic colonoscopy image generation. Downstream classification utilizes Vision Transformers (ViT) and **Effici...	03-12 21:05	Success	-	View
exp_cr_10.48175_ijarsct-25189_20260312_210423 Paper: cr_10.48175_ijarsct-25189	JobMatchr RAG Performance Benchmark Architecture & Retrieval: JobMatchr is a web-based RAG system built on Flask and LangChain. It employs a vector embedding retrieval architecture and depends on the proprietary Gemini-2.0-flash API for generation. * **RAG...	03-12 21:04	Success	-	View
exp_cr_10.2196_67677_20260312_210333 Paper: cr_10.2196_67677	Improving Dietary Supplement Information Retrieval: Development of a Retrieval-Augmented Generation System With Large La... Architecture: Knowledge Graph (KG) based RAG system utilizing a hybrid generator-retriever approach. The retrieval component extracts relevant subgraphs from the integrated Dietary Supplement Knowledgebase (iDISK2.0), containing 174k en...	03-12 21:03	Success	-	View
exp_2409.08597v1_20260312_210227 Paper: 2409.08597v1	LA-RAG:Enhancing LLM-based ASR Accuracy with Retrieval-Augmented Generation Architecture & RAG Design: LA-RAG is a specialized RAG framework for LLM-based ASR that utilizes speech-to-speech retrieval. Instead of text chunks, it indexes token-level speech datastores (acoustic embeddings). It retrieves si...	03-12 21:02	Success	-	View
exp_pytrain.20260312205933.019_20260312_210019 Paper: pytrain.20260312205933.019	Strictly Typed Plugin Registry Benchmark This benchmark tests the ability to implement a type-safe, dynamic plugin registry system using Python's standard library, mimicking patterns found in frameworks like Transformers or vLLM. It focuses on strict typing (`Protocol`, `Generic`,...	03-12 21:00	Success	-	View
exp_2409.08820v2_20260312_205755 Paper: 2409.08820v2	A RAG Approach for Generating Competency Questions in Ontology Engineering Summary: This paper validates a RAG workflow for generating Competency Questions (CQs) for ontology engineering from scientific papers. * Architecture: Uses GPT-4 as the generator. The retrieval component ingests scientific text to...	03-12 20:57	Success	-	View
exp_2409.09493v2_20260312_205701 Paper: 2409.09493v2	Pentest Copilot: LLM-Augmented Reasoning Benchmark Architecture: An agentic workflow ("Pentest Copilot") utilizing GPT-4-turbo with Chain of Thought (CoT) to automate penetration testing sub-tasks and interpret tool outputs. RAG & Retrieval: The abstract confirms RAG usage for hallu...	03-12 20:57	Success	-	View
exp_2409.10102v1_20260312_205606 Paper: 2409.10102v1	Trustworthiness in RAG: Lightweight Benchmark Summary: Type: Survey Paper (Review of Existing Techniques). Architecture/Memory/Speed: N/A. This paper does not propose a new model architecture, nor does it address memory footprint or inference speed optimizations. **RAG Spec...	03-12 20:56	Success	-	View
exp_2409.10173v3_20260312_205504 Paper: 2409.10173v3	Benchmark for jina-embeddings-v3 Architecture: A 570M parameter transformer utilizing task-specific Low-Rank Adaptation (LoRA) adapters to specialize embeddings for distinct objectives (retrieval, clustering, classification). Memory Footprint: Exceptionally efficie...	03-12 20:55	Success	-	View
exp_pytrain.20260312205305.018_20260312_205338 Paper: pytrain.20260312205305.018	Python Skill Fallback Title: Dynamic Plugin Loader with Runtime Type Verification - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-12 20:53	Success	-	View
exp_2409.15364v1_20260312_205145 Paper: 2409.15364v1	VERA: Validation and Enhancement for Retrieval Augmented systems Architecture: VERA wraps standard RAG with a dual-stage LLM validator. A "cum-enhancer" LLM pre-filters retrieved documents for relevance and redundancy, and a post-generator splits responses into atomic statements for fact-checking aga...	03-12 20:51	Success	-	View
exp_2409.12558v2_20260312_205101 Paper: 2409.12558v2	RAD-Bench: Evaluating Large Language Models Capabilities in Retrieval Augmented Dialogues Assessment: RAD-Bench is a benchmark framework for evaluating Search-Augmented Generation (SAG) and Retrieval-Augmented Generation (RAG) in multi-turn dialogues. It measures Retrieval Synthesis (aggregating info) and **Retri...	03-12 20:51	Success	-	View
exp_2409.12880v1_20260312_205007 Paper: 2409.12880v1	E-commerce Product Title Translation RAG Benchmark Architecture: Standard RAG pipeline coupling a dense retriever with a generative LLM. RAG Specifics: * Retrieval Architecture: Semantic search over a database of existing bilingual product titles. * Indexing: Stores "bilingu...	03-12 20:50	Success	-	View
exp_2409.13902v1_20260312_204900 Paper: 2409.13902v1	Ophthalmology RAG Benchmark Architecture: Domain-specific RAG pipeline utilizing a 70,000-document ophthalmology corpus to augment LLM inference. RAG Specifics: * Retrieval Strategy: Top-10 document retrieval (k=10). * Indexing/Chunking: Unspecified in...	03-12 20:49	Success	-	View
exp_pytrain.20260312204700.017_20260312_204727 Paper: pytrain.20260312204700.017	Runtime Type-Validated Dynamic Plugin Loader Overview This coding drill tests the integration of Python's dynamic module loading capabilities with the Structural Subtyping (Protocol) features introduced in recent Python versions. Goal Construct a self-contained runtime environment tha...	03-12 20:47	Success	-	View
exp_2409.19006v2_20260312_204534 Paper: 2409.19006v2	Towards Automated Patent Workflows: AI-Orchestrated Multi-Agent Framework for Intellectual Property Management and Analy... Architecture: PatExpert utilizes a multi-agent orchestration model comprising a meta-agent, task-specific expert agents, and critique agents (Gold/Reward-LLM-as-a-Judge). Retrieval (RAG): Employs **Graph Retrieval-Augmented Generati...	03-12 20:45	Success	-	View
exp_2409.14192v2_20260312_204429 Paper: 2409.14192v2	Benchmark: Knowledge in Triples for Table QA Architecture: A RAG framework that transforms semi-structured tables into (Subject, Predicate, Object) triples to feed the generator, bypassing the need for SQL/SPARQL parsing. Retrieval Strategy: * Indexing/Chunking: Data is ch...	03-12 20:44	Success	-	View
exp_2403.10798v2_20260312_204330 Paper: 2403.10798v2	Benchmarking Object Retrieval for Visual Question Answering (OR-OK-VQA) Architecture: Proposes OR-OK-VQA, a Visual RAG framework replacing global image retrieval with object-level retrieval. It employs Multi-scale Group Collaborative Embedding Learning (MS-GCEL) to generate unsupervised embeddin...	03-12 20:43	Success	-	View
exp_pytrain.20260312204031.016_20260312_204111 Paper: pytrain.20260312204031.016	Dynamic Package Loader with Protocol Enforcement This benchmark tests the ability to construct a robust runtime loader in Python using only the standard library. It simulates a micro-framework that dynamically generates a plugin architecture on the filesystem, loads these modules using `i...	03-12 20:41	Success	-	View
exp_cr_10.69987_jacs.2024.40306_20260312_203844 Paper: cr_10.69987_jacs.2024.40306	Semantic Verifier for Post-hoc Answer Validation in Chat Platforms: Claim Decomposition, Evidence Retrieval, NLI, and Tr... Architecture: Modular post-hoc verification pipeline consisting of Claim Decomposition, Evidence Retrieval, and NLI classification. Retrieval Strategy: Uses a "title-only evidence approximation." The system indexes Wikipedia page ti...	03-12 20:38	Success	-	View
exp_2403.14952v1_20260312_203735 Paper: 2403.14952v1	Evidence-Driven Retrieval Augmented Response Generation for Online Misinformation Architecture: RARG utilizes a two-stage pipeline: (1) Evidence Collection via retrieval and reranking from a corpus of 1M+ academic articles, and (2) Response Generation using an RLHF-aligned LLM tuned to maximize evidence utili...	03-12 20:37	Success	-	View
exp_cr_10.1609_aaai.v38i20.30590_20260312_203617 Paper: cr_10.1609_aaai.v38i20.30590	Select and Augment: Enhanced Dense Retrieval Knowledge Graph Augmentation (Abstract Reprint) Architecture: A dual-component framework combining a Knowledge Graph (KG) embedding model with a trainable dense Retriever. Unlike static augmentation, this model performs multi-task optimization to select and align KG entities with dyn...	03-12 20:36	Success	-	View
exp_pytrain.20260312203314.015_20260312_203351 Paper: pytrain.20260312203314.015	Python Skill Fallback Title: Generic CLI Toolkit with Type Parameter Syntax - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-12 20:33	Success	-	View
exp_cr_10.1609_aaai.v38i8.28717_20260312_203125 Paper: cr_10.1609_aaai.v38i8.28717	Learning to Rank in Generative Retrieval (LTRGR) Benchmark Architecture: LTRGR optimizes Generative Retrieval (typically T5-based Seq2Seq models) by introducing a Learning-to-Rank (LTR) training objective. It replaces standard maximum likelihood estimation with a ListWise rank loss,...	03-12 20:31	Success	-	View
exp_cr_10.1609_aaai.v38i16.29728_20260312_203030 Paper: cr_10.1609_aaai.v38i16.29728	This benchmark implements a lightweight, self-contained evaluation harness inspired by the RGB (Retrieval-Augmented Gene... Paper: Benchmarking Large Language Models in Retrieval-Augmented Generation (RGB) Type: RAG Evaluation & Robustness Analysis Summary: This paper introduces the RGB benchmark, isolating four critical RAG capabilities: noise r...	03-12 20:30	Success	-	View
exp_2403.16435v1_20260312_202944 Paper: 2403.16435v1	InstUPR Benchmark: Instruction-based Unsupervised Passage Reranking Architecture: Unsupervised reranker leveraging instruction-tuned LLMs via prompt engineering. Utilizes pairwise comparison and a novel soft score aggregation mechanism to rank passages without task-specific fine-tuning. **Retrie...	03-12 20:29	Success	-	View
exp_2403.17209v4_20260312_202855 Paper: 2403.17209v4	Benchmark: Asset Administration Shell (AAS) Generation via Semantic Nodes Architecture: Constructs a "semantic node" data structure to map raw technical datasheets into standardized Asset Administration Shells (AAS) for Industry 4.0. RAG Implementation: Utilizes Retrieval-Augmented Generation to ground te...	03-12 20:28	Success	-	View
exp_pytrain.20260312202620.014_20260312_202650 Paper: pytrain.20260312202620.014	Strictly Typed Config Package Builder This benchmark evaluates your ability to programmatically generate a valid Python package structure and implement a robust configuration system using modern standard packaging and static type safety features. Objective Create a single execu...	03-12 20:26	Success	-	View
exp_2309.07606v2_20260312_202532 Paper: 2309.07606v2	Zero-shot Audio Topic Reranking Benchmark Architecture: Dual-stage pipeline combining vector-based retrieval with zero-shot LLM reranking. Retrieval Strategy: Rapid search via video attribute embeddings. Reranking Method: Zero-shot LLM scoring to refine initial results....	03-12 20:25	Success	-	View
exp_2309.12767v1_20260312_202406 Paper: 2309.12767v1	Furthest Reasoning with Plan Assessment: Stable Reasoning Path with Retrieval-Augmented Large Language Models Architecture: An iterative RAG framework coupling a generator LLM with a distinct, trainable "Plan Assessor" module. RAG Specifics: * Architecture: Iterative Retrieval. * Strategy: Uses "Furthest Reasoning," where the LLM re...	03-12 20:24	Success	-	View
exp_2309.14805v1_20260312_202315 Paper: 2309.14805v1	Fine-tuning and aligning question answering models for complex information extraction tasks Architecture: Proposes a fine-tuned Extractive Question Answering (QA) architecture (specifically German encoder-based models) rather than generative LLMs. This approach focuses on span prediction to guarantee output grounding withi...	03-12 20:23	Success	-	View
exp_2309.15088v1_20260312_202150 Paper: 2309.15088v1	RankVicuna: Zero-Shot Listwise Document Reranking Benchmark RankVicuna adapts the Vicuna-7B LLM for zero-shot listwise document reranking, achieving performance comparable to GPT-3.5. * Architecture: Listwise permutation generation. It acts as a second-stage reranker, ingesting a query and r...	03-12 20:22	Success	-	View
exp_pytrain.20260312201913.013_20260312_201958 Paper: pytrain.20260312201913.013	Strictly-Typed Plugin Registry with Dynamic Dependency Loading Overview This benchmark evaluates a developer's ability to construct a framework-agnostic plugin architecture using Python's advanced type system and standard library introspection tools. Hypothesis An autonomous system can construct a robu...	03-12 20:20	Success	-	View
exp_2303.12024v3_20260312_201753 Paper: 2303.12024v3	Benchmark for cTBLS: Augmenting Large Language Models with Conversational Tables Architecture: cTBLS is a 3-stage RAG pipeline: (1) Dense Retrieval (Transformer encoders) for table selection, (2) Coarse+Fine Ranking (shared encoder-decoder) for cell selection, and (3) LLM Generation (paper uses GPT-3.5). **Retrieval...	03-12 20:17	Success	-	View
exp_2303.12501v1_20260312_201707 Paper: 2303.12501v1	Text-to-Image Person Retrieval Benchmark Paper: Cross-Modal Implicit Relation Reasoning and Aligning (IRRA) for Text-to-Image Person Retrieval Architecture & Retrieval Focus: IRRA proposes a cross-modal encoder architecture. Instead of treating modalities independently...	03-12 20:17	Success	-	View
exp_2304.00241v1_20260312_201624 Paper: 2304.00241v1	Benchmarking Bipartite Graph Convolutional Hashing (BGCH) Architecture: End-to-End Bipartite Graph Convolutional Network (GCN) that generates compact binary hash codes. It utilizes adaptive convolution and latent feature dispersion to preserve structural information during binarization. **Retr...	03-12 20:16	Success	-	View
exp_hf_2603.08075_20260312_201526 Paper: hf_2603.08075	TALON: Test-time Adaptive Learning for On-the-Fly Category Discovery Architecture: TALON replaces static hash-based quantization with a test-time adaptation framework featuring two core components: semantic-aware prototype updates (refining class representations) and stable test-time encoder updates (int...	03-12 20:15	Success	-	View
exp_pytrain.20260312201241.012_20260312_201316 Paper: pytrain.20260312201241.012	Typed Module Emulator with Semantic Versioning This benchmark evaluates the capability of a Python environment to construct a standalone, typed library module that simulates strict software packaging practices. Objective The candidate script, `benchmark.py`, must function as a self-cont...	03-12 20:13	Success	-	View
exp_hf_2603.10913_20260312_200041 Paper: hf_2603.10913	LLM2Vec-Gen: Generative Embeddings Benchmark Architecture: LLM2Vec-Gen utilizes a frozen LLM backbone augmented with trainable special tokens appended to the input. Training involves optimizing these tokens using the LLM’s own completions and distillation signals from an unsup...	03-12 20:11	Success	-	View
exp_2309.11049v2_20260312_195941 Paper: 2309.11049v2	Localize, Retrieve and Fuse: A Generalized Framework for Free-Form Question Answering over Tables Architecture: TAG-QA uses a three-stage pipeline: (1) Table-to-Graph conversion via Graph Neural Networks (GNN) to locate relevant cells; (2) External Retrieval fetching Wikipedia evidence; (3) Fusion Generator integrating b...	03-12 19:59	Success	-	View
exp_2512.12938v1_20260312_195815 Paper: 2512.12938v1	SPAR: Session-based Pipeline for Adaptive Retrieval Architecture: SPAR proposes a two-stage adaptive RAG framework. It replaces monolithic vector databases with a lightweight static Semantic Metadata Index coupled with dynamically generated, session-specific vector databases....	03-12 19:58	Success	-	View
exp_pytrain.20260312195537.011_20260312_195617 Paper: pytrain.20260312195537.011	Generic CLI Execution Engine with Type-Safe Decorators This benchmark demonstrates a robust, modular command-line interface system built entirely with the Python standard library. It leverages advanced typing features—specifically `typing.Protocol`, `typing.ParamSpec`, and `typing.Concatenate`—...	03-12 19:56	Success	-	View
exp_2506.13607v1_20260312_194437 Paper: 2506.13607v1	Tree-Based Text Retrieval via Hierarchical Clustering Architecture: Replaces standard vector search with a Hierarchical Clustering retrieval architecture. Indexing/Chunking: Uses a tree-based structure where document chunks are organized into hierarchical clusters based on sema...	03-12 19:54	Success	-	View
exp_cr_10.1609_aaai.v38i17.29947_20260312_194332 Paper: cr_10.1609_aaai.v38i17.29947	Fine-Grained Distillation for Long Document Retrieval Benchmark Architecture: FGD enhances standard dense bi-encoders (retrievers) via a specific training-stage distillation loss. It addresses the "granular-mismatch" in long documents by aligning global representations across multiple granularit...	03-12 19:43	Success	-	View
exp_cr_10.3390_math11122733_20260312_194233 Paper: cr_10.3390_math11122733	Benchmark: Automotive Domain Retrieval-Based QA Summary for ARES 8GB Roadmap This paper validates a domain-adaptive encoder-retriever for automotive QA using a BERT-base architecture fine-tuned via a pretraining-multitask framework. * Architecture & Retrieval: Standard **...	03-12 19:43	Success	-	View
exp_pytrain.20260312194036.010_20260312_194120 Paper: pytrain.20260312194036.010	Strictly Typed Asynchronous Package Architecture This benchmark evaluates a developer's ability to structure a formally typed Python package using `asyncio`, `typing.Generic`, and proper packaging markers (`py.typed`). The script dynamically generates the required package structure, verif...	03-12 19:41	Success	-	View
exp_hf_2603.09827_20260312_193938 Paper: hf_2603.09827	MA-EgoQA: Multi-Agent Egocentric Video QA Benchmark Architecture: EgoMAS proposes a RAG-style pipeline featuring a "shared memory" module to fuse multi-agent sensory data. It utilizes agent-wise dynamic retrieval, compressing video frames into feature embeddings via a vision encoder,...	03-12 19:39	Success	-	View
exp_oa_W4415233873_20260312_193841 Paper: oa_W4415233873	Healthcare RAG Performance Benchmark Architecture: This survey classifies RAG into Naive, Advanced, and Modular frameworks. For 8GB constraints, Naive RAG is the primary viable candidate for local inference, as it follows a linear "retrieve-then-read" pipeline. **RAG S...	03-12 19:38	Success	-	View
exp_oa_W4416955380_20260312_193759 Paper: oa_W4416955380	Evaluating Faithfulness in Agentic RAG Systems for e-Governance Applications Using LLM-Based Judging Frameworks Paper: Evaluating Faithfulness in Agentic RAG Systems for e-Governance Applications... Summary: This study proposes a modular Agentic RAG framework rather than a low-memory inference technique. It evaluates a hybrid retrieval ar...	03-12 19:38	Success	-	View
exp_2512.10942v2_20260312_193652 Paper: 2512.10942v2	VL-JEPA: Joint Embedding Predictive Architecture for Vision-language Architecture: Replaces autoregressive token generation with a Joint Embedding Predictive Architecture (JEPA). The model predicts continuous text embeddings via a vision encoder and predictor, utilizing a lightweight text decoder only wh...	03-12 19:36	Success	-	View
exp_2512.11614v2_20260312_193604 Paper: 2512.11614v2	Merlin-Arthur RAG Benchmarking Suite Architecture: Proposes a Merlin-Arthur (M/A) training protocol where a generator LLM ("Arthur") is trained using a helpful retriever ("Merlin") and an adversarial retriever ("Morgana"). RAG Specifications: * Retrieval: Utilizes...	03-12 19:36	Success	-	View
exp_pytrain.20260312193408.009_20260312_193437 Paper: pytrain.20260312193408.009	Dynamic Plugin Loader with Strict Protocol Typing This benchmark tests the ability to construct a modular plugin architecture using Python's advanced `typing` features and the `importlib` system. Overview The script programmatically creates a temporary package structure (`mock_package/`) c...	03-12 19:34	Success	-	View
exp_2512.11997v1_20260312_193214 Paper: 2512.11997v1	Benchmark: EnrichLog - Knowledge-Enriched Log Anomaly Detection Architecture: EnrichLog is a training-free, entry-based anomaly detection framework utilizing a RAG pipeline to fuse raw logs with external knowledge. Retrieval & Context Strategy: * Architecture: Vector-based retrieval (dense e...	03-12 19:32	Success	-	View
exp_2512.12694v1_20260312_193127 Paper: 2512.12694v1	Hybrid RAG Benchmark Architecture: Modular multilingual RAG pipeline utilizing Hybrid Retrieval to handle noisy OCR data. It combines semantic query expansion and multi-query fusion, aggregated via Reciprocal Rank Fusion (RRF) to stabilize recall ag...	03-12 19:31	Success	-	View
exp_2602.22219v1_20260312_193031 Paper: 2602.22219v1	Comparative Analysis of Neural Retriever-Reranker Pipelines for Retrieval-Augmented Generation over Knowledge Graphs in... Paper: Comparative Analysis of Neural Retriever-Reranker Pipelines for Retrieval-Augmented Generation over Knowledge Graphs in E-commerce Applications Summary for ARES 8GB Roadmap: This study evaluates Retriever-Reranker pipelines f...	03-12 19:30	Success	-	View
exp_2512.13632v1_20260312_192937 Paper: 2512.13632v1	StutterFuse: Performance Benchmark Architecture: StutterFuse is a Retrieval-Augmented Classifier (RAC) combining a Conformer encoder with a Gated Mixture-of-Experts (MoE). It conditions acoustic features on a non-parametric memory bank of clinical example...	03-12 19:29	Success	-	View
exp_pytrain.20260312192727.008_20260312_192800 Paper: pytrain.20260312192727.008	Robust Namespace Package Loader with Structural Typing This benchmark evaluates your ability to construct a scalable plugin architecture using modern Python typing features (PEP 544 Protocols) and the standard library's import system (`importlib`, `pkgutil`). Objective Implement a `PluginLoader...	03-12 19:28	Success	-	View
exp_2512.14313v1_20260312_192550 Paper: 2512.14313v1	Dynamic Context Selection for Retrieval-Augmented Generation: Mitigating Distractors and Positional Bias Architecture & Retrieval Strategy This paper replaces standard fixed top-$k$ retrieval with a dynamic context selection mechanism. The architecture introduces a lightweight context-size classifier (likely a BERT-style model) tha...	03-12 19:25	Success	-	View
exp_cr_10.55606_jurritek.v4i3.6664_20260312_192458 Paper: cr_10.55606_jurritek.v4i3.6664	This repository contains the benchmarking code for the UCIC Academic Service Chatbot based on the Retrieval-Augmented Ge... Paper: Chatbot Layanan Akademik Calon Mahasiswa UCIC Menggunakan Metode RAG Summary for 8GB Roadmap: * Architecture: Standard Retrieval-Augmented Generation (RAG) pipeline orchestrated via LangChain. * Retrieval: FAISS (...	03-12 19:25	Success	-	View
exp_cr_10.37432_jieph-confpro5-00265_20260312_192430 Paper: cr_10.37432_jieph-confpro5-00265	Enhancing Lassa fever health literacy through AI: Development and evaluation of a retrieval-augmented generation chatbot... Architecture: Standard Retrieval-Augmented Generation (RAG) chatbot. Retrieval Strategy: Curated static documents (WHO, NCDC). *Specific indexing, chunking strategy, vector database, and reranking methods are not specified in the pr...	03-12 19:24	Success	-	View
exp_2506.12483v1_20260312_192353 Paper: 2506.12483v1	MALM: A Multi-Information Adapter for Large Language Models to Mitigate Hallucination Architecture: MALM introduces a parameter-efficient adapter utilizing a multilayered Graph Attention Network (GAT). It explicitly models the interdependencies between the original input, retrieved context, and parametric knowledge t...	03-12 19:23	Success	-	View
exp_2506.14035v1_20260312_192313 Paper: 2506.14035v1	SimpleDoc Benchmark Architecture: Agentic multi-modal RAG framework utilizing a Vision Language Model (VLM) for both embedding and final reasoning. Retrieval Architecture & Strategy: * Indexing/Chunking: Pages are indexed as visual chunks using VLM...	03-12 19:23	Success	-	View
exp_pytrain.20260312192112.007_20260312_192135 Paper: pytrain.20260312192112.007	Typed Component Registry System This project implements a robust, type-safe component registry pattern using Python's `typing` module. It demonstrates how to build a plugin architecture where the compiler and runtime enforce strict interface compliance, reducing attribute...	03-12 19:21	Success	-	View
exp_2506.15001v1_20260312_191004 Paper: 2506.15001v1	Memory Token Benchmark Architecture: Introduces "Memory Tokens"—single, optimized embedding vectors that act as lossless, compressed keys. When prompted with this token, the LLM reconstructs the original text sequence (up to ~240 tokens) exactly without weigh...	03-12 19:20	Success	-	View
exp_2506.16035v2_20260312_190911 Paper: 2506.16035v2	Vision-Guided Chunking Benchmark Architecture: Multimodal RAG utilizing Large Multimodal Models (LMMs) for document parsing instead of traditional text extractors. Retrieval & Chunking: Vision-Guided Chunking. The strategy processes PDFs in **configurable page...	03-12 19:09	Success	-	View
exp_pytrain.20260312190702.006_20260312_190728 Paper: pytrain.20260312190702.006	Runtime Type-Checked Plugin Loader This benchmark demonstrates a robust, autonomous plugin architecture using Python's standard library. The system simulates a multi-module package hierarchy entirely in-memory using `types` and `importlib`, bypassing the need for physical fi...	03-12 19:07	Success	-	View
exp_2506.16037v1_20260312_185540 Paper: 2506.16037v1	Multi-Hop RAG Benchmark for LLaMA 3 Architecture: LLaMA 3 enhanced with a Dense Retrieval Module and multi-hop reasoning chains for complex, long-document QA. RAG Specifics: * Retrieval Architecture: Dense Retrieval. * Optimization/Reranking: Uses **Jo...	03-12 19:05	Success	-	View
exp_pytrain.20260312185359.005_20260312_185418 Paper: pytrain.20260312185359.005	Generic Service Registry & Dispatcher Benchmark This benchmark evaluates the implementation of a robust, type-safe Service Registry using Python's standard `typing` module. The focus is on structural subtyping via `Protocol`, generics via `TypeVar`, and simulating proper packaging conven...	03-12 18:54	Success	-	View
exp_cr_10.3390_ai6030050_20260312_185245 Paper: cr_10.3390_ai6030050	Benchmark: Multimodal RAG for Eurobarometer Data Architecture: Modular framework integrating Retrieval-Augmented Generation (RAG) with Multimodal Large Language Models (MLLMs) to process Eurobarometer surveys (text + charts/images). RAG Specifics: * Retrieval Architecture: Mul...	03-12 18:52	Success	-	View
exp_cr_10.71070_oaml.v5i1.141_20260312_185203 Paper: cr_10.71070_oaml.v5i1.141	Retrieval-augmented generation for personalized physician recommendations in online medical services: model development... Architecture: Standard dense RAG. The system uses embedding-based retrieval to match patient queries against a database of consultation records and physician profiles, followed by an LLM synthesizing the recommendation. Retrieval: E...	03-12 18:52	Success	-	View
exp_oa_W4404390755_20260312_185120 Paper: oa_W4404390755	LEGO-GraphRAG Benchmark Architecture: LEGO-GraphRAG decomposes the GraphRAG pipeline into four modular stages: Query Understanding, Retrieval, Subgraph Construction, and Response Synthesis. RAG Specifics: * Retrieval Architecture: Modul...	03-12 18:51	Success	-	View
exp_2409.08479v2_20260312_185042 Paper: 2409.08479v2	Exploring Information Retrieval Landscapes: An Investigation of a Novel Evaluation Techniques and Comparative Document S... Assessment for ARES 8GB Roadmap This paper focuses on optimizing RAG preprocessing pipelines rather than core inference architecture or VRAM management. Retrieval & Chunking: * Architecture: The study evaluates a standard RAG pi...	03-12 18:50	Success	-	View
exp_2409.09281v2_20260312_184944 Paper: 2409.09281v2	Benchmark: Language Models "Grok" to Copy This paper is a theoretical study of Transformer internal dynamics, specifically regarding the formation of Induction Heads—the attention mechanism responsible for copying context, a prerequisite for In-Context Learning (ICL) an...	03-12 18:49	Success	-	View
exp_pytrain.20260312184753.004_20260312_184813 Paper: pytrain.20260312184753.004	Robust Plugin Loader with Runtime Type Checking Difficulty: Intermediate Focus: Dynamic Packaging, Structural Typing (`typing.Protocol`), `importlib` Time Limit: 20 Seconds Objective Implement a self-contained Python benchmark that simulates a plugin architecture. The system...	03-12 18:48	Success	-	View
exp_2409.10955v2_20260312_184722 Paper: 2409.10955v2	Investigating Context-Faithfulness in Large Language Models: The Roles of Memory Strength and Evidence Style Verdict: Low-priority architectural integration, high-priority retrieval pipeline optimization. Research Focus: This is an empirical analysis of RAG behaviors rather than a new model architecture. It investigates how **Memory Streng...	03-12 18:47	Success	-	View
exp_2409.11242v4_20260312_184603 Paper: 2409.11242v4	Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse Architecture & Memory: Trust-Align is an alignment strategy designed for small, open-weight models (LLaMA 1-8B, Qwen 0.5-7B, Phi-3.5). It focuses on "Grounded Attributions" and "Learning to Refuse," ensuring outputs strictly adhere to r...	03-12 18:46	Success	-	View
exp_2409.12812v3_20260312_184511 Paper: 2409.12812v3	CoDrivingLLM Benchmark Architecture: CoDrivingLLM utilizes a modular design separating semantic reasoning from physics. An Environment Module handles mathematical updates (vehicle kinematics), while a CoT-based Reasoning Module manages state perceptio...	03-12 18:45	Success	-	View
exp_2409.13682v1_20260312_184426 Paper: 2409.13682v1	ReMEmbR Benchmark: Long-Horizon Memory Retrieval Architecture: ReMEmbR is a retrieval-augmented framework utilizing a dual-phase structure: a memory building phase and a querying phase. It uses a Vision-Language Model (VLM) to encode video frames and metadata into a memory bank, rathe...	03-12 18:44	Success	-	View
exp_2409.13992v1_20260312_184330 Paper: 2409.13992v1	SMART-RAG: Context Selection Benchmark Architecture: SMART-RAG replaces standard top-k selection with Determinantal Point Processes (DPPs) to optimize for both relevance and diversity. Retrieval & Budget: Utilizes a Retrieve-then-Select strategy. It retrieves a l...	03-12 18:43	Success	-	View
exp_pytrain.20260312184117.003_20260312_184144 Paper: pytrain.20260312184117.003	Type-Safe Dynamic Package Registry Benchmark This benchmark tests the robustness of a dynamic Python plugin system. It simulates an environment where functionality is extended at runtime by loading modules from the filesystem. The core challenge is to ensure that these dynamically loa...	03-12 18:41	Success	-	View
exp_oa_W4399511665_20260312_183957 Paper: oa_W4399511665	Multi-Head RAG: Solving Multi-Aspect Problems with LLMs Architecture: MRAG modifies the Retriever Component by using the activations of each Transformer attention head as distinct retrieval keys, rather than a single aggregated embedding vector. Retrieval & Indexing: It utilizes a **mu...	03-12 18:40	Success	-	View
exp_2403.14197v1_20260312_183843 Paper: 2403.14197v1	Context Quality Matters in Training Fusion-in-Decoder for Extractive Open-Domain Question Answering Architecture: Fusion-in-Decoder (FiD). Retrieval Strategy: Passage-level retrieval with multi-context concatenation. Memory Footprint: Critical Constraint. FiD encodes all retrieved passages simultaneously in the encoder. Th...	03-12 18:38	Success	-	View
exp_2403.14374v1_20260312_183727 Paper: 2403.14374v1	FIT-RAG: Black-Box RAG Benchmark Architecture: FIT-RAG optimizes black-box RAG using a Bi-label Document Scorer (aligns retrieval with factual relevance rather than LLM preference), a Self-knowledge Recognizer (bypasses retrieval if the frozen LLM knows the ans...	03-12 18:38	Success	-	View
exp_2403.15268v5_20260312_183652 Paper: 2403.15268v5	Awakening Augmented Generation (AAG) Benchmark Architecture: A non-retrieval framework designed to activate internal knowledge. It employs a Context Generator to synthesize a compressed "symbolic" document and a Hypernetwork to generate dynamic, query-specific adapters. These adapte...	03-12 18:36	Success	-	View
exp_pytrain.20260312183421.002_20260312_183454 Paper: pytrain.20260312183421.002	Dynamic Module Loader with PEP 695 Syntax This benchmark tests the ability to implement a generic wrapper for runtime module loading using Python 3.12+'s Type Parameter Syntax (PEP 695). Objective Create a script `dynamic_loader.py` that implements a generic class `ModuleLoader[T]`...	03-12 18:34	Success	-	View
exp_2404.07221v2_20260312_182309 Paper: 2404.07221v2	Benchmark: RAG Retrieval Enhancement on Financial Documents This paper proposes a modular RAG optimization pipeline focused on financial document QA, aiming to fix retrieval errors rather than LLM limitations. * Architecture: Standard RAG with dense vector retrieval. * Chunking: "Sophisticat...	03-12 18:33	Success	-	View
exp_2403.15729v3_20260312_182222 Paper: 2403.15729v3	RAGS4EIC Summarization Benchmark RAGS4EIC proposes a RAG-based agent for managing complex scientific documentation using a modular LangChain workflow. * Architecture: A two-stage pipeline: a comprehensive Vector Database for semantic retrieval and an LLM fo...	03-12 18:22	Success	-	View
exp_cr_10.1609_aaai.v38i21.30577_20260312_182149 Paper: cr_10.1609_aaai.v38i21.30577	GEAR-Up: Generative AI and External Knowledge-Based Retrieval: Upgrading Scholarly Article Searches for Systematic Revie... Architecture: KG-augmented query expansion pipeline. The system retrieves semantic context from a Knowledge Graph (KG) to enrich user queries before passing them to an LLM for translation and refinement. Retrieval Strategy: * **Retr...	03-12 18:21	Success	-	View
exp_2309.13375v2_20260312_182119 Paper: 2309.13375v2	Benchmark: Generative Retrieval with SEATER (Semantic Tree-Structured IDs) Paper: SEATER (SEmAntic Tree-structured item identifiERs) Architecture: An Encoder-Decoder Generative Retrieval framework optimized for large-scale recommendations. It replaces traditional vector similarity search with autoregre...	03-12 18:21	Success	-	View
exp_2309.15217v2_20260312_182034 Paper: 2309.15217v2	Ragas: Automated Evaluation of Retrieval Augmented Generation Subject: Ragas (Automated RAG Evaluation Framework) Architecture: An LLM-as-a-Judge framework. It utilizes prompt engineering to guide an LLM to score specific dimensions—Context Precision (retrieval quality), Faithfulness (...	03-12 18:20	Success	-	View
exp_pytrain.20260312181849.001_20260312_181909 Paper: pytrain.20260312181849.001	Dynamic Plugin Registry with Runtime Type Validation This drill verifies the ability to design a robust, extensible plugin system using Python's standard library. Candidates must demonstrate proficiency with `typing.Protocol` for structural subtyping, `importlib` for dynamic code loading, and...	03-12 18:19	Success	-	View
exp_pytrain.20260312140657.027_20260312_140723 Paper: pytrain.20260312140657.027	Dynamic Type-Safe Plugin Loader with Runtime Validation README.md Dynamic Type-Safe Plugin Loader with Runtime Validation Overview This benchmark demonstrates a robust, autonomous system for loading Python plugins dynamically from a simulated package distribution. It enforces strict type safety...	03-12 14:09	Success	-	View
exp_oa_W7114889968_20260312_140058 Paper: oa_W7114889968	RAG vs. Parametric Performance Benchmark Paper Type: Systematic Literature Review (SLR) Analysis Scope: Synthesis of 128 studies (Jan 2020–May 2025) on Retrieval-Augmented Generation (RAG). Architecture & Feasibility: N/A (Survey Paper). This paper does not propose a s...	03-12 14:05	Success	-	View
exp_pytrain.20260312135459.026_20260312_135525 Paper: pytrain.20260312135459.026	Strict Configuration Dispatcher Benchmark README.md Strict Configuration Dispatcher Benchmark Objective This benchmark evaluates an autonomous agent's ability to implement a "Configuration-to-Instance" dispatcher, a core pattern in high-performance machine learning frameworks (e.g....	03-12 13:56	Success	-	View
exp_2512.12935v1_20260312_135132 Paper: 2512.12935v1	Unified Interactive Multimodal Moment Retrieval - Benchmark Paper: Unified Interactive Multimodal Moment Retrieval via Cascaded Embedding-Reranking and Temporal-Aware Score Fusion Summary: Retrieval Architecture: A cascaded dual-encoder system using BEIT-3 and SigLIP for broad ca...	03-12 13:52	Success	-	View
exp_2512.14554v4_20260312_134845 Paper: 2512.14554v4	VLegal-Bench: Cognitively Grounded Benchmark for Vietnamese Legal Reasoning of Large Language Models This paper introduces VLegal-Bench, a benchmark rather than a novel model architecture, designed to evaluate LLMs on Vietnamese legal reasoning using 10,450 expert-validated samples. * Architecture & Feasibility: The benchmark f...	03-12 13:49	Success	-	View
exp_pytrain.20260312134313.025_20260312_134349 Paper: pytrain.20260312134313.025	Typed CLI Dispatcher & Entry-Point Simulation README.md Typed CLI Dispatcher & Entry-Point Simulation This benchmark demonstrates an advanced understanding of Python's type system and software architecture patterns, specifically focusing on creating a modular, extensible CLI framework...	03-12 13:44	Success	-	View
exp_cr_10.5334_uproc.170_20260312_133920 Paper: cr_10.5334_uproc.170	Smart Decision-Making: The Role of Digital Twins, Retrieval-Augmented Generation-Enhanced AI, and Learning Analytics Architecture: Proposes a macro-architecture integrating Learning Analytics (data mining), Digital Twins (simulation), and RAG-enhanced LLMs (synthesis) for higher-ed management. RAG Specifics: Missing Technical Specs. The abstra...	03-12 13:40	Success	-	View
exp_pytrain.20260312133517.024_20260312_133557 Paper: pytrain.20260312133517.024	Strict Package Metadata and Typing Inspector Benchmark README.md Strict Package Metadata and Typing Inspector Benchmark Overview This benchmark evaluates a system's ability to generate a Python CLI tool that performs static analysis on a codebase. The tool, `pkg_inspector.py`, must verify packa...	03-12 13:36	Success	-	View
exp_cr_10.3390_ai6090226_20260312_133329 Paper: cr_10.3390_ai6090226	Section 1: README.md Type: Systematic Literature Review (SLR). Architecture: Synthesizes Naïve, Advanced, and Modular RAG architectures for clinical applications (diagnostics, EHR summarization, QA). RAG Specifics: As a survey, it aggreg...	03-12 13:34	Success	-	View
exp_pytrain.20260312132648.023_20260312_132722 Paper: pytrain.20260312132648.023	Dynamic Plugin Loader with Structural Subtyping This benchmark demonstrates a robust, zero-dependency plugin architecture using Python's standard library. Objective To simulate an autonomous coding system capable of: 1. Defining Strict Interfaces: Using `typing.Protocol` to enforce s...	03-12 13:28	Success	-	View
exp_2506.13026v1_20260312_132337 Paper: 2506.13026v1	Knowledge Graph Fusion with Large Language Models for Accurate, Explainable Manufacturing Process Planning Architecture: ARKNESS is a GraphRAG framework fusing zero-shot Knowledge Graph (KG) construction with on-premise LLMs for CNC process planning. Retrieval Strategy: * Indexing: Converts heterogeneous documents into multi-relation...	03-12 13:24	Success	-	View
exp_2506.15862v1_20260312_132201 Paper: 2506.15862v1	Here is the design for the MoR (Mixture of Retrievers) benchmark. Architecture & Memory MoR proposes a lightweight gating network (0.8B parameters) to dynamically fuse outputs from heterogeneous retrievers. The architecture combines BM25 (Sparse), Dense Embeddings (Semantic), and specialized Human ret...	03-12 13:22	Success	-	View
exp_pytrain.20260312131840.022_20260312_131907 Paper: pytrain.20260312131840.022	--- README.md --- Generic Plugin System Benchmark (PEP 695) Overview This benchmark evaluates the implementation of a Generic Plugin System using modern Python 3.12+ features. It specifically validates the usage of **PEP 695 Type Parameter...	03-12 13:19	Success	-	View
exp_cr_10.3897_biss.8.136735_20260312_131443 Paper: cr_10.3897_biss.8.136735	Benchmark: LLM-Based Biodiversity Information Extraction Summary for ARES 8GB Roadmap Objective: Automate the extraction of deep learning metadata (datasets, metrics, hyperparameters) from biodiversity literature to replace manual annotation. RAG & Architecture: * Base Model: Mixt...	03-12 13:16	Success	-	View
exp_pytrain.20260312131038.021_20260312_131136 Paper: pytrain.20260312131038.021	Python Skill Fallback Title: Robust Dependency Graph Resolver using Structural Typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-12 13:11	Success	-	View
exp_2409.11190v2_20260312_130720 Paper: 2409.11190v2	SuperCoder2.0: Architecture Benchmark Architecture & RAG: SuperCoder2.0 utilizes a multi-agent architecture with a three-step hierarchical RAG pipeline. 1. Retrieval: Uses a Repository File Level Map to identify candidate files. 2. Chunking/Indexing: Refines...	03-12 13:08	Success	-	View
exp_2409.12468v3_20260312_130537 Paper: 2409.12468v3	Familiarity-Aware Evidence Compression (FaviComp) Benchmark Paper: FaviComp (Familiarity-Aware Evidence Compression) Architecture: FaviComp is a training-free compression module designed to sit between the retriever and the generator in a RAG pipeline. It utilizes the target generator’s...	03-12 13:06	Success	-	View
exp_pytrain.20260312130336.020_20260312_130353 Paper: pytrain.20260312130336.020	Dynamic Plugin Loader with Type-Safe Contracts Benchmark README.md Dynamic Plugin Loader with Type-Safe Contracts Benchmark This benchmark evaluates a Python system's ability to dynamically load code at runtime while strictly enforcing interface compliance using `typing.Protocol`. Objective The g...	03-12 13:04	Success	-	View
exp_2409.12682v2_20260312_130030 Paper: 2409.12682v2	Here is the runnable benchmark for the "Retrieval-Augmented Test Generation" innovation. Summary for ARES 8GB Roadmap Architecture & RAG Strategy: The paper evaluates a Basic RAG pipeline against a domain-specific API-level RAG approach. The retrieval architecture pulls from three external sources: API documenta...	03-12 13:01	Success	-	View
exp_2409.14175v2_20260312_125902 Paper: 2409.14175v2	QMOS: Enhancing LLMs for Telecommunication with Question Masked loss and Option Shuffling Architecture: Fine-tunes efficient Small Language Models (SLMs), specifically Phi-2 (2.7B) and Falcon-7B, within a RAG framework. Introduces Question-Masked Loss (masking query tokens to force context-to-option alignment) an...	03-12 12:59	Success	-	View
exp_pytrain.20260312125609.019_20260312_125643 Paper: pytrain.20260312125609.019	Type-Safe Plugin Registry and Configuration Loader Benchmark README.md Type-Safe Plugin Registry and Configuration Loader Benchmark Overview This benchmark evaluates the capability of an autonomous coding system to implement core architectural patterns found in large-scale machine learning frameworks...	03-12 12:56	Success	-	View
exp_2403.17759v1_20260312_125445 Paper: 2403.17759v1	TWOLAR: a TWO-step LLM-Augmented distillation method for passage Reranking Architecture: A two-step distillation pipeline training a lightweight BERT-based Cross-Encoder student to mimic the zero-shot reranking capabilities of a large LLM teacher. RAG & Retrieval Strategy: * Retrieval: Agnostic to...	03-12 12:55	Success	-	View
exp_2512.12980v2_20260312_125254 Paper: 2512.12980v2	Benchmark: Iceberg - Task-Centric Vector Similarity Search This paper introduces Iceberg, a benchmark suite evaluating Vector Similarity Search (VSS) architectures based on downstream task utility rather than isolated recall-latency metrics. Retrieval Architecture: Focuses on **Approxim...	03-12 12:53	Success	-	View
exp_2512.13771v1_20260312_125045 Paper: 2512.13771v1	Here is the design for the Semantic Grounding Index (SGI) benchmark. Architecture: Introduces the Semantic Grounding Index (SGI), a geometric post-hoc detector analyzing angular distances on a hypersphere ($\mathbb{S}^{d-1}$). It identifies "semantic laziness" where responses remain proximate to question...	03-12 12:51	Success	-	View
exp_pytrain.20260312124840.018_20260312_124909 Paper: pytrain.20260312124840.018	```markdown README.md bash python benchmark.py	03-12 12:49	Success	-	View
exp_cr_10.63887_jtie.2025.1.3.3_20260312_124710 Paper: cr_10.63887_jtie.2025.1.3.3	Benchmark: LLM-RAG Patent Retrieval System Architecture & Retrieval The paper proposes a cloud-centric RAG framework utilizing `gpt-3.5-turbo` for generation and an unspecified "high-efficiency vector retrieval engine" for semantic search. The pipeline consists of data preproces...	03-12 12:47	Success	-	View
exp_oa_W4409588626_20260312_124607 Paper: oa_W4409588626	Benchmark: Mamba-GraphRAG for Medical Reasoning Architecture & Retrieval: This paper proposes a hybrid GraphRAG system using a Neo4j knowledge graph (storing UMLS entities) combined with a dense vector store (textbook embeddings). The retrieval architecture is dual-layered: it perfor...	03-12 12:46	Success	-	View
exp_oa_W4410082953_20260312_124514 Paper: oa_W4410082953	Investigation: Evidence-Based GraphRAG for USMLE Questions Architecture: Hybrid GraphRAG utilizing Neo4j for symbolic reasoning (UMLS entities) and a vector store for semantic search (textbook embeddings). Retrieval: Dual-strategy indexing: graph-based entity mapping and dense retrieval...	03-12 12:45	Success	-	View
exp_2506.16444v2_20260312_124429 Paper: 2506.16444v2	REIS: In-Storage Processing Retrieval Benchmark Architecture: REIS proposes an In-Storage Processing (ISP) architecture that offloads Approximate Nearest Neighbor (ANNS) retrieval computations directly to the SSD controller, minimizing data movement between storage and host. **RAG Sp...	03-12 12:44	Success	-	View
exp_cr_10.3390_app14177995_20260312_124342 Paper: cr_10.3390_app14177995	Here is the benchmark design for the Personalized RAG System. Architecture & Retrieval Strategy This paper implements a standard RAG pipeline using hybrid retrieval. It combines semantic search via `text-embedding-ada-002` with keyword tagging to organize documents into **context-based cat...	03-12 12:43	Success	-	View
exp_pytrain.20260312124150.017_20260312_124217 Paper: pytrain.20260312124150.017	--- README.md	03-12 12:42	Success	-	View
exp_2409.09916v1_20260312_124107 Paper: 2409.09916v1	SFR-RAG Benchmark Suite Architecture & RAG Design SFR-RAG-9B is a dense, instruction-tuned decoder-only model optimized specifically for the Reader/Generator component of RAG. It does not define a specific internal retrieval architecture but is engineered...	03-12 12:41	Success	-	View
exp_2309.10966v6_20260312_123950 Paper: 2309.10966v6	MBR and QE Finetuning: Training-time Distillation of the Best and Most Expensive Decoding Methods Architecture: Standard Transformer encoder-decoder. The authors propose "MBR Finetuning" and "QE Finetuning," training strategies that distill the knowledge of expensive decoding methods (Minimum Bayes' Risk decoding and Quality Estimat...	03-12 12:39	Success	-	View
exp_2512.10787v2_20260312_123829 Paper: 2512.10787v2	SEAL-RAG Benchmark Architecture: SEAL-RAG is a training-free controller wrapping standard RAG components. It executes a Search $\rightarrow$ Extract $\rightarrow$ Assess $\rightarrow$ Loop cycle to perform multi-hop reasoning without expanding the con...	03-12 12:38	Success	-	View
exp_cr_10.1609_aaai.v37i4.25598_20260312_123709 Paper: cr_10.1609_aaai.v37i4.25598	ConTextual Masked Auto-Encoder (CoT-MAE) Benchmark Architecture & Memory: CoT-MAE utilizes an asymmetric encoder-decoder for pre-training but deploys only the encoder for inference. This structure is optimized to compress sentence semantics into dense vectors. Memory footprint i...	03-12 12:37	Success	-	View
exp_pytrain.20260312123508.016_20260312_123543 Paper: pytrain.20260312123508.016	Strictly Typed Plugin Registry Benchmark README.md Strictly Typed Plugin Registry Benchmark This drill verifies the use of Python's `typing.Protocol` and `typing.Generic` to build a robust, loosely-coupled system suitable for a distributable library. Objective Candidates must impl...	03-12 12:35	Success	-	View
exp_oa_W4415560266_20260312_123343 Paper: oa_W4415560266	This benchmark evaluates the performance impact of the proposed MCP-aware Re-ranking mechanism integrated into a Ret... Architecture: Hybrid multi-agent system utilizing RAG, an Agent Communication Protocol (ACP) for orchestration, and a Model Context Protocol (MCP) for context fusion. Retrieval & Indexing: Python prototype using a vector store for t...	03-12 12:33	Success	-	View
exp_oa_W4416430905_20260312_123308 Paper: oa_W4416430905	RAGSmith: A Framework for Finding the Optimal Composition of Retrieval-Augmented Generation Methods Across Datasets RAGSmith employs a genetic search to optimize RAG pipelines over 46,080 configurations. * Architecture: The study identifies Vector Retrieval + Post-Generation Reflection/Revision as the optimal backbone. Passage compression...	03-12 12:33	Success	-	View
exp_oa_W4416075695_20260312_123223 Paper: oa_W4416075695	Benchmark: Retrieval-Augmented Generation (RAG) Performance Architecture: Hybrid "retrieve-then-generate" framework combining parametric LLMs with external, non-parametric knowledge retrieval. RAG Specifics: As a comprehensive review, this paper outlines the general paradigm rather than a si...	03-12 12:32	Success	-	View
exp_2512.10393v2_20260312_123140 Paper: 2512.10393v2	BinSeek: Cross-Modal Retrieval for Stripped Binary Analysis Architecture: BinSeek implements a two-stage retrieval pipeline: a dual-encoder (BinSeek-Embedding) for efficient high-recall retrieval, followed by a cross-encoder (BinSeek-Reranker) for context-aware refinement. **Retrieva...	03-12 12:31	Success	-	View
exp_2512.10422v3_20260312_123035 Paper: 2512.10422v3	Cooperative RAG (CoopRAG) Benchmark Architecture: CoopRAG utilizes a dual-component system featuring a dense retriever and an LLM that iteratively exchange states. The retriever employs a "Contrasting Layers" mechanism to rank documents by comparing representations from e...	03-12 12:30	Success	-	View
exp_pytrain.20260312122836.015_20260312_122907 Paper: pytrain.20260312122836.015	```markdown README.md	03-12 12:29	Success	-	View
exp_2512.12458v2_20260312_122715 Paper: 2512.12458v2	Benchmark Design: Stability of Multi-Vector vs. Single-Vector Retrieval Architecture: Theoretical analysis of Multi-vector (ColBERT-style), Filtered, and Sparse retrieval systems. Key Findings: * Multi-vector: Proves Chamfer distance preserves stability, while average pooling fails....	03-12 12:27	Success	-	View
exp_oa_W4417313874_20260312_122609 Paper: oa_W4417313874	Biomedical RAG Trilemma Benchmark Summary for ARES 8GB Roadmap: This survey (2020–2025) classifies biomedical RAG into naive, advanced, and modular architectures, formalizing the "Biomedical RAG Trilemma" (trade-offs between reasoning depth, inference latenc...	03-12 12:26	Success	-	View
exp_2512.13072v1_20260312_122527 Paper: 2512.13072v1	Benchmark: Retrieval-Guided Continual Learning (RG-CL) for Medical VLMs Architecture: Multimodal VLM framework integrating dynamic knowledge distillation with a multi-modal, multi-layer RAG system for Continual Learning (CL). RAG Strategy: Retrieves from a massive **18-million record PubMed-derived...	03-12 12:25	Success	-	View
exp_2512.14465v2_20260312_122452 Paper: 2512.14465v2	Context-Picker: Dynamic Context Selection Benchmark Architecture: Replaces static Top-K retrieval with a two-stage Reinforcement Learning (RL) policy. It first maximizes recall of critical passages, then prunes redundancy to distill a minimal sufficient evidence set. **RAG Specifics:...	03-12 12:24	Success	-	View
exp_2506.12981v2_20260312_122409 Paper: 2506.12981v2	SymRAG: Neuro-Symbolic Retrieval Benchmark Architecture: SymRAG introduces a neuro-symbolic RAG framework centered on an adaptive query router. This router assesses real-time query complexity and system load to dynamically dispatch requests to symbolic (rule-based), neural (...	03-12 12:24	Success	-	View
exp_pytrain.20260312122203.014_20260312_122246 Paper: pytrain.20260312122203.014	Python Skill Fallback Title: Typed Plugin Registry for Model Architectures - Focus: typing.Protocol, typing.TypeVar, typing.Generic, abc, dataclasses, Runtime type checking simulation - Note: Generated fallback due to unavailable model output.	03-12 12:22	Success	-	View
exp_2506.14412v2_20260312_122133 Paper: 2506.14412v2	RAGtifier: Evaluating RAG Generation Approaches of State-of-the-Art RAG Systems for the SIGIR LiveRAG Competition Architecture: Dense retrieval pipeline utilizing Pinecone vectors, a BGE cross-encoder reranker, and InstructRAG for flow control, terminating in a Falcon-3-10B generator. Memory Footprint: High Risk. Falcon-3-10B requires aggre...	03-12 12:21	Success	-	View
exp_2506.15522v1_20260312_122001 Paper: 2506.15522v1	Benchmark: Grounded LLM Inference & Verification Architecture: Standard decoder LLMs augmented with internal reasoning traces. Optimized via GRPO (Group Relative Policy Optimization) using verifiable outcome-based rewards. No architectural changes for memory reduction. **Retrieval...	03-12 12:20	Success	-	View
exp_oa_W4403815812_20260312_121905 Paper: oa_W4403815812	Here is the design for the QAEncoder benchmark. Architecture & Retrieval Strategy QAEncoder is a training-free augmentation for dense retrieval (Dual-Encoder). It bridges the query-document gap by generating Question-Expected Embeddings (QEE)—estimating the center of a query...	03-12 12:19	Success	-	View
exp_2409.10576v2_20260312_121819 Paper: 2409.10576v2	Language Models and Retrieval Augmented Generation for Automated Structured Data Extraction from Diagnostic Reports Architecture & Feasibility: Benchmarks open-weights models (Llama 3, medical fine-tunes) for structured clinical extraction. High implementation feasibility for local deployment. Memory Footprint: Crucially validates that **quantiza...	03-12 12:18	Success	-	View
exp_2409.11353v3_20260312_121729 Paper: 2409.11353v3	Here is the design for a small, runnable benchmark tailored to the THaMES innovation. This benchmark focuses on the effi... Architecture & Implementation: THaMES is a modular framework applying In-Context Learning (ICL), Retrieval-Augmented Generation (RAG), and PEFT (LoRA) to mitigate hallucinations. It automates test generation and benchmarking. **RAG & Re...	03-12 12:17	Success	-	View
exp_pytrain.20260312121538.013_20260312_121607 Paper: pytrain.20260312121538.013	Strict-Typed Plugin Registry with Runtime Validation README.md Strict-Typed Plugin Registry with Runtime Validation Overview This benchmark evaluates the design and implementation of a robust, type-safe plugin system in Python using `typing.Protocol`, `TypeGuard`, and strict packaging hygiene...	03-12 12:16	Success	-	View
exp_2409.13385v2_20260312_121410 Paper: 2409.13385v2	Benchmark: Contextual Compression in RAG Architecture: This survey reviews Contextual Compression paradigms, integrating filtering and condensation modules between the retriever and LLM to process raw retrieved data. Memory Footprint & Speed: Compression reduces input...	03-12 12:14	Success	-	View
exp_2403.12583v1_20260312_121333 Paper: 2403.12583v1	Quantixar: High-performance Vector Data Management System Architecture & Retrieval: Quantixar proposes a vector database architecture utilizing HNSW (Hierarchical Navigable Small World) indexing for Approximate Nearest Neighbor (ANN) search. To manage high-dimensional data, it implements a...	03-12 12:13	Success	-	View
exp_2404.07220v2_20260312_121247 Paper: 2404.07220v2	Blended RAG Benchmark Architecture & Retrieval Strategy: Blended RAG proposes a Hybrid Sparse-Dense Retrieval architecture. It utilizes Dense Vector indexes (semantic search via bi-encoders) blended with Sparse Encoder indexes (lexical search) an...	03-12 12:13	Success	-	View
exp_2309.11392v1_20260312_121149 Paper: 2309.11392v1	This benchmark evaluates the performance of a Retrieval-Augmented Generation (RAG) verification pipeline, inspired by th... Architecture: Hybrid Retrieval-Augmented Verification. Retrieval Strategy: Combines sparse and dense retrieval with neural rerankers on the MS MARCO V1 corpus. Verification Methods: 1. Holistic: Validates the entire generate...	03-12 12:12	Success	-	View
exp_2310.01429v1_20260312_121109 Paper: 2310.01429v1	Chatmap: Geospatial LLM Benchmark Architecture & Feasibility ChatMap utilizes a 1B parameter student model fine-tuned via distillation (using a larger teacher) to interpret OpenStreetMap (OSM) data. This is highly feasible for 8GB VRAM targets; the model require...	03-12 12:11	Success	-	View
exp_pytrain.20260312120920.012_20260312_120945 Paper: pytrain.20260312120920.012	Title: Strictly Typed Configuration Module Benchmark README.md Title: Strictly Typed Configuration Module Benchmark Description: This benchmark evaluates an autonomous coding system's ability to construct a robust, single-file Python module (`config_manager.py`). The module must enfor...	03-12 12:09	Success	-	View
exp_2303.13416v1_20260312_120815 Paper: 2303.13416v1	Title: A Unified Framework for Learned Sparse Retrieval (LSR) Architecture: Unified Learned Sparse Retrieval (LSR) framework using BERT-style encoders (e.g., Splade) to generate sparse lexical representations for inverted indices. Retrieval Specifics: * Retrieval Architecture: Inverted Ind...	03-12 12:08	Success	-	View
exp_2512.12117v1_20260312_120729 Paper: 2512.12117v1	Here is the design for the Citation-Grounded Code Comprehension benchmark. Retrieval Architecture: Hybrid RAG system combining BM25 (sparse), BGE (dense), and Neo4j graph retrieval. Indexing & Context: Indexing leverages code structure, specifically import relationships, to link cross-file dependencies...	03-12 12:07	Success	-	View
exp_cr_10.24908_iqurcp19921_20260312_120640 Paper: cr_10.24908_iqurcp19921	Performing Automated Employment Law Case Analysis Using Large Language Models Architecture: Comparative evaluation of Retrieval-Augmented Generation (RAG) strategies—specifically Vector Chunking, Graph RAG, and Full-Context ("No-processing")—for legal QA on the Sagaz dataset. RAG Specifics: * **Retrieval & In...	03-12 12:06	Success	-	View
exp_2506.17288v1_20260312_120546 Paper: 2506.17288v1	SlimRAG: Retrieval without Graphs via Entity-Aware Context Selection Architecture: SlimRAG is a graph-free, entity-centric framework replacing Knowledge Graph (KG) construction with a lightweight "entity-to-chunk" table. RAG Implementation: * Retrieval Architecture: Entity-aware context selection...	03-12 12:05	Success	-	View
exp_pytrain.20260312120304.011_20260312_120347 Paper: pytrain.20260312120304.011	Strictly Typed Plugin Registry with Runtime Protocol Validation Overview This benchmark evaluates the robustness of a Python plugin architecture utilizing `typing.Protocol` and `@runtime_checkable`. It simulates a system where modules must strictly adhere to a defined interface (`DataProcessor`) before...	03-12 12:03	Success	-	View
exp_2409.10516v3_20260312_120126 Paper: 2409.10516v3	```markdown Architecture & Retrieval Strategy RetrievalAttention offloads the Key-Value (KV) cache from GPU VRAM to CPU DRAM, replacing quadratic attention with a sparse, vector-retrieval mechanism. It constructs Approximate Nearest Neighbor Search...	03-12 12:01	Success	-	View
exp_2403.11366v2_20260312_120006 Paper: 2403.11366v2	JORA: JAX Tensor-Parallel LoRA Benchmark Architecture: JORA utilizes a JAX-based framework featuring just-in-time (JIT) compilation and tensor-sharding (Tensor Parallelism) to enable distributed LoRA fine-tuning of Llama-2 models. Memory Footprint: Reduces per-GPU VRAM con...	03-12 12:00	Success	-	View
exp_2304.00114v1_20260312_115913 Paper: 2304.00114v1	Benchmark: Dense Sparse Retrieval (Efficiency Focus) Architecture: The paper proposes replacing standard dense encoders (e.g., BERT) with sparse-activated language models (specifically Switch Transformers) within a Bi-encoder framework. It utilizes the Tevatron library for impleme...	03-12 11:59	Success	-	View
exp_pytrain.20260312115633.010_20260312_115720 Paper: pytrain.20260312115633.010	Strictly Typed Dynamic Plugin Loader Benchmark README.md Strictly Typed Dynamic Plugin Loader Benchmark This benchmark tests a Python engineer's ability to bridge dynamic runtime code execution with static type safety. Problem Context In large-scale autonomous systems, plugins are often...	03-12 11:57	Success	-	View
exp_2601.03262v1_20260312_115404 Paper: 2601.03262v1	Benchmark: MLLM Roles in Visually Rich Document Retrieval (VRD) Summary: This survey classifies MLLM roles for Visually Rich Document (VRD) retrieval into three architectures: 1. Modality-Unifying Captioners: MLLMs synthesize figures/tables into text. * Retrieval Strategy: Text-to-Text (compat...	03-12 11:55	Success	-	View
exp_2506.12571v1_20260312_115229 Paper: 2506.12571v1	DoTA-RAG Benchmark: Dynamic-of-Thought Aggregation Architecture: DoTA-RAG implements a three-stage pipeline: query rewriting, dynamic routing to specialized sub-indexes, and multi-stage retrieval with ranking. Retrieval Strategy: The system utilizes a re-embedded FineWeb-10BT corpus...	03-12 11:52	Success	-	View
exp_2506.14707v1_20260312_115140 Paper: 2506.14707v1	HARMONY: A Scalable Distributed Vector Database for High-Throughput Approximate Nearest Neighbor Search Paper: HARMONY (Scalable Distributed Vector DB) * Architecture: Distributed Approximate Nearest Neighbor (ANN) engine utilizing a multi-granularity partition strategy. This hybrid approach combines dimension-based and vector...	03-12 11:51	Success	-	View
exp_2506.15246v1_20260312_115025 Paper: 2506.15246v1	TopClustRAG Benchmark Suite Architecture: TopClustRAG utilizes a Hybrid Retrieval Architecture (Sparse + Dense) followed by K-Means clustering to group semantically similar chunks. The system generates distinct, cluster-specific intermediate answers that a...	03-12 11:50	Success	-	View
exp_pytrain.20260312114750.009_20260312_114826 Paper: pytrain.20260312114750.009	Dynamic Package Construction and Runtime Protocol Verification README.md Dynamic Package Construction and Runtime Protocol Verification This benchmark tests an autonomous agent's ability to programmatically generate Python code, construct a valid package structure on the disk, define strict interfaces...	03-12 11:48	Success	-	View
exp_2506.15513v1_20260312_114609 Paper: 2506.15513v1	RePCS: Retrieval-Path Contamination Scoring Benchmark Architecture: RePCS is a model-agnostic diagnostic algorithm, not a new LLM. It detects memorization by calculating the Kullback-Leibler (KL) divergence between two output distributions: a parametric path (Query only) versus a retrieval...	03-12 11:46	Success	-	View
exp_2303.13220v1_20260312_114522 Paper: 2303.13220v1	Parameter-Efficient Sparse Retrievers and Rerankers using Adapters Architecture: Inserts lightweight bottleneck Adapters into SPLADE (Sparse Lexical and Expansion), keeping the heavy Pre-trained Language Model (PLM) frozen. Also applies adapters to rerankers, enabling knowledge transfer between ret...	03-12 11:45	Success	-	View
exp_cr_10.1007_s11227-025-07118-9_20260312_114443 Paper: cr_10.1007_s11227-025-07118-9	Benchmark: GPU-Centric Storage Optimization (ESPN vs. Baseline) Architecture & Retrieval Strategy: This paper proposes a GPU-centric retrieval architecture using GPUDirect Storage (GDS) to bypass CPU bottlenecks, enabling direct SSD-to-GPU data transfer. It introduces **Embedding from Storag...	03-12 11:44	Success	-	View
exp_2506.13589v3_20260312_114351 Paper: 2506.13589v3	AdaVideoRAG Benchmark Architecture: AdaVideoRAG introduces a lightweight Intent Classifier that dynamically routes queries to appropriate retrieval schemes (Naive, Visual, or Knowledge Graph) based on complexity, avoiding unnecessary processing for simpl...	03-12 11:43	Success	-	View
exp_pytrain.20260312114131.008_20260312_114214 Paper: pytrain.20260312114131.008	Robust Distribution Inspector README.md Robust Distribution Inspector Overview The Robust Distribution Inspector is a command-line utility designed to inspect Python packages installed in the current environment. It demonstrates strict type usage using Python's `typ...	03-12 11:42	Success	-	View
exp_oa_W4416322438_20260312_112829 Paper: oa_W4416322438	Benchmark: RAG-Augmented LLM for Yunnan Arabica Coffee Cultivation Architecture & Retrieval: This paper implements a Retrieve–Rerank–Generate pipeline. It employs hybrid retrieval (dense + sparse) fused by Reciprocal Rank Fusion (RRF) and semantic-aware chunking with stable identifiers (`do...	03-12 11:39	Success	-	View
exp_2512.12284v3_20260312_112710 Paper: 2512.12284v3	```markdown V-Rex targets streaming video LLMs on edge devices, specifically addressing memory bandwidth and compute bottlenecks inherent to continuous video processing. * Retrieval Architecture: Implements Dynamic KV Cache Retrieval (ReSV)...	03-12 11:27	Success	-	View
exp_pytrain.20260312112513.007_20260312_112532 Paper: pytrain.20260312112513.007	Python Skill Fallback Title: Robust Generic Tensor Arithmetic Module - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-12 11:25	Success	-	View
exp_2409.15355v5_20260312_112349 Paper: 2409.15355v5	Benchmark: Block-Attention for Efficient Prefilling in RAG Architecture: Block-Attention decouples context into independent passage blocks. Instead of sequential prefilling, KV states are computed in parallel. Crucially, it enables KV state reuse, allowing cached retrieval passages to be re...	03-12 11:23	Success	-	View
exp_2403.13291v1_20260312_112222 Paper: 2403.13291v1	Late-Interaction Retrieval & Token Pruning Benchmark Architecture: Analyzes Late-Interaction models (ColBERT/COIL), which use multi-vector token embeddings and sum-of-max scoring rather than single-vector dense retrieval. Memory Footprint: Addresses the prohibitive storage cost of...	03-12 11:22	Success	-	View
exp_2506.21593v1_20260312_112121 Paper: 2506.21593v1	PentaRAG Benchmark Simulation Architecture: PentaRAG implements a 5-layer cascading router that prioritizes speed: (1) Fixed KV Cache, (2) Semantic Cache, (3) Memory-Recall (exploiting LLM internal weights), (4) Adaptive Session Memory, and (5) Conventional Retrieva...	03-12 11:21	Success	-	View
exp_2601.06037v4_20260312_112037 Paper: 2601.06037v4	TeleMem: Building Long-Term and Multimodal Memory for Agentic AI Architecture & Retrieval: TeleMem is a RAG-based memory system employing a structured writing pipeline (batching, retrieval, clustering, and consolidation) to maintain narrative user profiles. It integrates a multimodal memory module wi...	03-12 11:20	Success	-	View
exp_pytrain.20260312111831.006_20260312_111859 Paper: pytrain.20260312111831.006	Dynamic Backend Registry with Protocol Validation README.md Dynamic Backend Registry with Protocol Validation Overview This benchmark tests the ability to design a robust, scalable plugin architecture similar to those found in high-performance Machine Learning libraries (e.g., vLLM, Diffus...	03-12 11:19	Success	-	View
exp_2309.13335v2_20260312_111710 Paper: 2309.13335v2	Model-enhanced Vector Index Architecture: MEVI uses a differentiable hybrid architecture combining a Twin-Tower representation model with a Seq2Seq generator, bridged by a Residual Quantization (RQ) codebook. Retrieval Strategy: A two-stage "Generative-to-Dens...	03-12 11:17	Success	-	View
exp_cr_10.54963_jic.v4i2.1706_20260312_111619 Paper: cr_10.54963_jic.v4i2.1706	BERT and Beyond: A Comprehensive Survey of Natural Language Processing Techniques for Information Retrieval Paper Analysis: Survey (Taxonomy & Trends) Architecture: Surveys Dual-Encoder (Bi-Encoder) BERT models for semantic retrieval and Cross-Encoders for reranking. Highlights Hybrid Dense-Sparse architectures (combining vect...	03-12 11:16	Success	-	View
exp_2506.21601v2_20260312_111526 Paper: 2506.21601v2	Hierarchical Patch Compression for ColPali (HPC-ColPali) Benchmark Architecture: Extends ColPali (a VLM-based multi-vector retrieval architecture) with Hierarchical Patch Compression (HPC). * Retrieval Strategy: Utilizes patch-level embeddings. * Indexing: Optimized via HNSW indexing an...	03-12 11:15	Success	-	View
exp_2304.01016v3_20260312_111433 Paper: 2304.01016v3	Quick Dense Retrievers Consume KALE: Post Training Kullback Leibler Alignment of Embeddings for Asymmetrical dual encode... Architecture: Asymmetrical Dual Encoders (Bi-Encoder). Retrieval Strategy: Dense Retrieval via Knowledge Distillation. KALE aligns the pruned query encoder's output distribution to the original teacher using Kullback-Leibler diverge...	03-12 11:14	Success	-	View
exp_pytrain.20260312111138.005_20260312_111221 Paper: pytrain.20260312111138.005	Dynamic Protocol-Based Plugin System Benchmark README.md Dynamic Protocol-Based Plugin System Benchmark Objective This benchmark tests the ability to implement a robust plugin architecture using Python's standard library. The focus is on dynamic code loading from strings, runtime type s...	03-12 11:12	Success	-	View
exp_pytrain.20260312103311.004_20260312_103345 Paper: pytrain.20260312103311.004	Protocol-Enforced Virtual Package Importer README.md Protocol-Enforced Virtual Package Importer Design Brief This coding drill benchmark tests the hypothesis that an autonomous system can construct a robust internal packaging mechanism by extending `sys.meta_path`. The system must i...	03-12 10:33	Success	-	View
exp_pytrain.20260312101232.003_20260312_101306 Paper: pytrain.20260312101232.003	Type-Safe Dynamic Package Generator & Importer Overview This coding drill benchmarks your ability to use Python's standard library for dynamic code generation and runtime module loading. Unlike simple `eval()` or `exec()` calls, this exercise requires the creation of a valid, im...	03-12 10:13	Success	-	View
exp_pytrain.20260312095303.002_20260312_095323 Paper: pytrain.20260312095303.002	Benchmark: PEP 695 Generic Vault with Explicit Public API README.md Benchmark: PEP 695 Generic Vault with Explicit Public API Description This coding drill verifies the implementation of a generic `Vault` class using Python 3.12+ syntax (PEP 695) and a strictly defined public interface using `__al...	03-12 09:53	Success	-	View
exp_pytrain.20260312093112.001_20260312_093149 Paper: pytrain.20260312093112.001	Here is the runnable Python coding drill benchmark designed to your specifications. README.md Generic Repository Package Construction Benchmark Overview This benchmark evaluates a Python system's ability to programmatically scaffold a Python package structure and utilize advanced typing features (specifically `Protocol` an...	03-12 09:31	Success	-	View
exp_hf_2603.10757_20260312_092735 Paper: hf_2603.10757	CodePercept: Code-Grounded Visual STEM Perception Benchmark Analysis for ARES 8GB Roadmap Architecture & Methodology CodePercept proposes a "Code-as-Perception" paradigm, asserting that visual perception—not reasoning—is the bottleneck in STEM tasks. It introduces ICC-1M, a dataset of 1M Ima...	03-12 09:28	Success	-	View
exp_2409.14515v1_20260312_092641 Paper: 2409.14515v1	SPAQ-DL-SLAM: Towards Optimizing Deep Learning-based SLAM for Resource-Constrained Embedded Platforms Architecture: SPAQ-DL-SLAM optimizes DROID-SLAM by applying 20% structured pruning (based on layer-wise sensitivity analysis) and 8-bit post-training static quantization (PTQ) to its deep learning modules. Memory Footprint: Achieves...	03-12 09:26	Success	-	View
exp_pytrain.20260312092411.002_20260312_092435 Paper: pytrain.20260312092411.002	```markdown README.md	03-12 09:24	Success	-	View
exp_2309.16870v1_20260312_092235 Paper: 2309.16870v1	LEF: Late-to-Early Temporal Fusion for LiDAR 3D Object Detection Architecture LEF proposes a recurrent "late-to-early" fusion scheme that injects object-aware latent embeddings into the early stages of a pillar-based detector. It processes temporally aligned sparse pillar tokens using window-based at...	03-12 09:22	Success	-	View
exp_2309.16870v1_20260312_092138 Paper: 2309.16870v1	LEF: Late-to-Early Temporal Fusion Benchmark Architecture LEF proposes a recurrent "late-to-early" fusion scheme that injects object-aware latent embeddings into the early stages of a pillar-based detector. It processes temporally aligned sparse pillar tokens using window-based at...	03-12 09:21	Success	-	View
exp_2512.14879v1_20260312_092048 Paper: 2512.14879v1	README.md Architecture: Proposes Entropy-Reservoir Bregman Projection (ERBP), a theoretical framework for self-referential training. It addresses model collapse via information geometry rather than proposing a new hardware-efficient model archite...	03-12 09:20	Success	-	View
exp_2512.14938v1_20260312_091949 Paper: 2512.14938v1	--- Architecture The model utilizes a 5B parameter Diffusion Transformer (DiT) built upon Wan2.2. To manage long-form generation, it employs a sliding window mechanism with motion-frame context and a high-compression Video VAE. **Memory Foo...	03-12 09:19	Success	-	View
exp_pytrain.20260312091738.001_20260312_091806 Paper: pytrain.20260312091738.001	Runtime-Checked Plugin Architecture Drill README.md Runtime-Checked Plugin Architecture Drill Overview This benchmark demonstrates an autonomous system constructing a robust Python package (`text_ops`) that leverages structural subtyping (Protocols) to define interfaces. It ensures...	03-12 09:18	Success	-	View
exp_2409.14595v1_20260312_091509 Paper: 2409.14595v1	```markdown Architecture: EchoAtt optimizes transformers by sharing attention matrices across layers with high similarity. It utilizes knowledge distillation to train a student model that selectively "echoes" (copies) attention computations from ea...	03-12 09:15	Success	-	View
exp_pytrain.20260312091146.013_20260312_091233 Paper: pytrain.20260312091146.013	Dynamic Protocol-Based Plugin Loader This benchmark demonstrates the hypothesis that utilizing structural subtyping (`typing.Protocol`) combined with dynamic module loading (`importlib`) creates a more flexible and maintainable architecture than traditional, rigid inheritance...	03-12 09:12	Success	-	View
exp_oa_W4395065783_20260312_090948 Paper: oa_W4395065783	This benchmark suite is designed to validate the core efficiency hypotheses presented in "A Survey on Efficient Inferenc... This survey identifies three core architectural bottlenecks for LLM deployment: massive parameter counts, quadratic-complexity attention mechanisms, and auto-regressive decoding. It categorizes solutions into a three-tier taxonomy: 1. **Mem...	03-12 09:09	Success	-	View
exp_hf_2603.08899_20260312_090837 Paper: hf_2603.08899	ConFu: Contemplate the Future for Better Speculative Sampling Architecture: ConFu optimizes speculative decoding by introducing "contemplate tokens" and soft prompts into the draft model. It employs a lightweight Mixture-of-Experts (MoE) layer to dynamically predict future context, reducing the er...	03-12 09:08	Success	-	View
exp_hf_2603.10744_20260312_090743 Paper: hf_2603.10744	--- Architecture: JiT is a training-free inference framework targeting spatial redundancy in Diffusion Transformers (DiT). It replaces full latent processing with a spatially approximated generative ODE, driven by a dynamically sele...	03-12 09:07	Success	-	View
exp_hf_2603.10705_20260312_090644 Paper: hf_2603.10705	Prism-Δ: Differential Subspace Steering for Prompt Highlighting in Large Language Models Architecture: PRISM-Δ steers generation by decomposing the difference between positive and negative cross-covariance matrices to isolate discriminative directions. It utilizes continuous softplus weighting for attention heads—allowing w...	03-12 09:06	Success	-	View
exp_pytrain.20260312090424.012_20260312_090457 Paper: pytrain.20260312090424.012	Type-Safe Plugin Architecture with `importlib` README.md Type-Safe Plugin Architecture with `importlib` This benchmark implements a zero-dependency plugin registry system inspired by HuggingFace Transformers. It demonstrates how to use Python's `typing` module (Generics, TypeVars) to en...	03-12 09:05	Success	-	View
exp_2304.00280v1_20260312_090304 Paper: 2304.00280v1	Benchmark: Progressive Channel-Shrinking Network (PCS) Architecture: Introduces Progressive Channel-Shrinking (PCS) to replace unstable gating functions in salience-based pruning. It employs a Running Shrinking Policy (RSP) to transition from dynamic training to a testing-static pruning...	03-12 09:03	Success	-	View
exp_2512.14925v2_20260312_090159 Paper: 2512.14925v2	Here is the runnable benchmark for the Multiscale Aggregated Hierarchical Attention (MAHA) innovation. Architecture: MAHA replaces standard MHSA with a hybrid dilated-convolutional transformer backbone. It utilizes learnable downsampling to partition inputs into hierarchical scales and aggregates attention maps using differentiable conve...	03-12 09:02	Success	-	View
exp_2403.18159v2_20260312_090048 Paper: 2403.18159v2	Benchmark for "Oh! We Freeze" (OV-Freeze) Architecture: Introduces ov-freeze, a lightweight Quantization-Aware Knowledge Distillation (KD-QAT) technique. It stabilizes the training of 4-bit weight quantized LLMs by addressing gradient propagation vulnerabilities identified...	03-12 09:01	Success	-	View
exp_pytrain.20260312085711.011_20260312_085758 Paper: pytrain.20260312085711.011	This document describes the "Runtime Checkable Plugin Loader" benchmark. README.md This document describes the "Runtime Checkable Plugin Loader" benchmark. Overview This benchmark tests the ability to implement a robust, dynamic plugin system using Python's standard library. It focuses on structural subtyping (P...	03-12 08:58	Success	-	View
exp_2506.16600v2_20260312_085525 Paper: 2506.16600v2	FLAME: Federated Fine-Tuning Benchmark FLAME proposes a Sparse Mixture-of-Experts (SMoE) framework for federated LLM fine-tuning, designed to eliminate the performance degradation caused by compressing LoRA matrices on low-resource clients. * Architecture: Replaces stand...	03-12 08:55	Success	-	View
exp_2506.16640v4_20260312_085418 Paper: 2506.16640v4	Benchmark: Adaptive-Scalable Entmax (ASEntmax) Simulation Architecture Proposes Adaptive-Scalable Entmax (ASEntmax), a drop-in replacement for Softmax attention. It utilizes $\alpha$-entmax to assign exact zeros to irrelevant tokens, creating dynamically sparse attention maps. A learnable...	03-12 08:54	Success	-	View
exp_oa_W4404313603_20260312_085338 Paper: oa_W4404313603	Here is the runnable benchmark for the Small Language Model (SLM) innovation, focusing on Dynamic Precision (Mixed Pre... Architecture:** Reviews compact transformer designs and Small Language Models (typically <7B parameters) optimized for edge environments. It highlights architectural trade-offs that maintain task performance while reducing parameter count...	03-12 08:53	Success	-	View
exp_2309.16795v2_20260312_085243 Paper: 2309.16795v2	Benchmark: Ultra-low-power Image Classification (Quartz SNN) Paper: Ultra-low-power Image Classification on Neuromorphic Hardware (Quartz) Architecture: Proposes "Quartz," a temporal conversion method that translates stateless ANNs to Spiking Neural Networks (SNNs) using Time-To-First-Spike (...	03-12 08:53	Success	-	View
exp_2304.00335v1_20260312_085153 Paper: 2304.00335v1	Here is the runnable benchmark for the Volumetric Attribute Compression innovation. Architecture Replaces RAHT’s piecewise constant functions with a feedforward linear network implementing higher-order B-spline bases. The core mechanism is a space-varying convolution (Geometric Attention) where weights are dynamica...	03-12 08:51	Success	-	View
exp_pytrain.20260312084937.010_20260312_085018 Paper: pytrain.20260312084937.010	Type-Safe Pipeline Package Benchmark README.md Type-Safe Pipeline Package Benchmark This benchmark evaluates a Python implementation of a modular, type-safe data processing pipeline. The implementation leverages advanced Python `typing` features, including Generics, Protocols,...	03-12 08:50	Success	-	View
exp_oa_W4416386252_20260312_084802 Paper: oa_W4416386252	Which Heads Matter for Reasoning? RL-Guided KV Cache Compression Architecture: RLKV utilizes offline Reinforcement Learning to probe and identify specific attention heads critical for generative reasoning and Chain-of-Thought (CoT) stability. Unlike static pruning, it optimizes head selection by dire...	03-12 08:48	Success	-	View
exp_hf_2603.09488_20260312_084623 Paper: hf_2603.09488	Streaming Autoregressive Video Generation via Diagonal Distillation Architecture Proposes Diagonal Distillation, an asymmetric autoregressive strategy. It allocates higher denoising steps to initial video chunks to establish high-fidelity features, while subsequent chunks use significantly fewer ste...	03-12 08:46	Success	-	View
exp_2601.11557v1_20260312_084437 Paper: 2601.11557v1	Benchmark: Information-Theoretic Binarization vs. Float32 ANN Architecture: Replaces the standard "HNSW + float32" stack with Maximally Informative Binarization (MIB). The system utilizes exhaustive search over 1-bit binary vectors using bitwise distance metrics and Information-Theoretic Scori...	03-12 08:45	Success	-	View
exp_pytrain.20260312084136.009_20260312_084229 Paper: pytrain.20260312084136.009	Strictly Typed Modular Data Processor This benchmark evaluates the implementation of a data processing system using Python's structural subtyping features and strict module packaging hygiene. Overview The drill requires the creation of a single-file module (`benchmark.py` which...	03-12 08:42	Success	-	View
exp_hf_2603.02188_20260312_084023 Paper: hf_2603.02188	Multi-Head Low-Rank Attention (MLRA) Benchmark Architecture MLRA modifies Multi-Head Latent Attention (MLA) by replacing the non-partitionable single latent head with a multi-head latent structure. This allows the latent Key-Value states to be effectively sharded across GPUs. **Memo...	03-12 08:40	Success	-	View
exp_oa_W4416557533_20260312_083858 Paper: oa_W4416557533	Small Language Models (SLM) Efficiency Benchmark Architecture: Survey of design frameworks and training methodologies for edge-compatible Small Language Models (SLMs). Memory Footprint: Focuses heavily on minimizing model size through optimization techniques, specifically pruning,...	03-12 08:39	Success	-	View
exp_oa_W4415037605_20260312_083754 Paper: oa_W4415037605	Hardware-Efficient Attention for Fast Decoding Summary for ARES 8GB Roadmap * Architecture: Proposes Grouped-Tied Attention (GTA) and Grouped Latent Attention (GLA). Both mechanisms optimize arithmetic intensity by reusing key-value states (GTA) or utilizing parallel-fri...	03-12 08:37	Success	-	View
exp_pytrain.20260312083445.008_20260312_083532 Paper: pytrain.20260312083445.008	Generic Plugin Registry with Semantic Versioning README.md Generic Plugin Registry with Semantic Versioning This benchmark demonstrates a robust, self-contained module loader that simulates a mini packaging ecosystem. Objectives 1. PEP 695 Implementation: Utilize Python 3.12+ Type Par...	03-12 08:35	Success	-	View
exp_oa_W4415048600_20260312_082323 Paper: oa_W4415048600	```markdown Analysis for ARES 8GB Roadmap: * Architecture: Prioritizes hybrid edge-cloud collaborative systems (e.g., EdgeShard) and microservices over monolithic designs. Suggests leveraging intelligent workload distribution to bypass local ha...	03-12 08:33	Success	-	View
exp_cr_10.3389_frobt.2025.1518965_20260312_082215 Paper: cr_10.3389_frobt.2025.1518965	A survey of model compression techniques: past, present, and future This paper provides a comprehensive methodological framework for optimizing Large Language Models (LLMs) within the ARES 8GB hardware constraints. As a survey, it does not propose a specific architecture but evaluates compression techniques...	03-12 08:22	Success	-	View
exp_oa_W4415098413_20260312_082141 Paper: oa_W4415098413	Artificial Hippocampus Networks (AHN) Benchmark Architecture: A hybrid framework combining a 32k sliding window attention buffer (short-term memory) with a learnable recurrent compressor (Artificial Hippocampus Network) for long-term memory. The AHN utilizes modern RNN architectures...	03-12 08:21	Success	-	View
exp_pytrain.20260312081922.007_20260312_081950 Paper: pytrain.20260312081922.007	Strictly Typed Generic Registry with Package Metadata An autonomous coding system can effectively utilize Python's advanced type system (Protocols and Generics) to enforce interface safety while simultaneously adhering to library packaging standards (`__all__`, versioning) to ensure API stabil...	03-12 08:19	Success	-	View
exp_2512.14946v1_20260312_081715 Paper: 2512.14946v1	EVICPRESS: Joint KV-Cache Compression and Eviction for Efficient LLM Serving Summary for ARES 8GB Roadmap: * Architecture: A multi-tier KV management system (GPU VRAM to CPU RAM) that jointly optimizes eviction and lossy compression. It utilizes a "unified utility function" to balance quality loss against la...	03-12 08:17	Success	-	View
exp_oa_W4410363086_20260312_081534 Paper: oa_W4410363086	Distributed & Multimodal LLM Benchmark This survey advocates for distributed architectures—including data, model, and pipeline parallelism—to mitigate the memory and computational constraints of centralized Large Language Models (LLMs) and Multimodal LLMs (MLLMs). * **Architectu...	03-12 08:16	Success	-	View
exp_oa_W4416458930_20260312_081349 Paper: oa_W4416458930	On-Device Large Language Models: A Survey of Model Compression and System Optimization This survey systematizes on-device LLM optimization (1-4B parameters) using the ALEM (Accuracy, Latency, Energy, Memory) protocol. * Architecture: Advocates for hybrid pipelines combining quantization, structured pruning with mergea...	03-12 08:14	Success	-	View
exp_pytrain.20260312081042.006_20260312_081110 Paper: pytrain.20260312081042.006	Strictly Typed Dynamic Plugin Loader with Validation README.md Strictly Typed Dynamic Plugin Loader with Validation This benchmark demonstrates a robust, enterprise-grade plugin architecture using Python's standard library. It leverages `typing.Protocol` to enforce structural sub-typing (Stat...	03-12 08:11	Success	-	View
exp_pytrain.20260312080135.005_20260312_080217 Paper: pytrain.20260312080135.005	Dynamic Type-Safe Plugin Loader Benchmark README.md Dynamic Type-Safe Plugin Loader Benchmark Overview This benchmark evaluates the ability of a Python execution environment to implement a robust, type-safe plugin architecture using only the standard library. It tests the integrati...	03-12 08:02	Success	-	View
exp_pytrain.20260312074012.004_20260312_074102 Paper: pytrain.20260312074012.004	Python Skill Fallback Title: Strictly Typed Plugin Loader with Public API Enforcement - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-12 07:41	Success	-	View
exp_pytrain.20260312073002.003_20260312_073025 Paper: pytrain.20260312073002.003	Typed Package Metadata Auditor README.md Typed Package Metadata Auditor This benchmark evaluates the system's ability to generate robust, type-safe Python tooling using only the standard library. Goal: Create a self-contained script `benchmark.py` that acts as a pack...	03-12 07:30	Success	-	View
exp_pytrain.20260312072152.002_20260312_072217 Paper: pytrain.20260312072152.002	PEP 695 Generic Dependency Resolver Benchmark This benchmark evaluates the implementation of a `DependencyGraph` using Python 3.12+'s PEP 695 Type Parameter Syntax. Requirements - Python Version: 3.12 or higher (required for PEP 695 syntax). - Dependencies: None (Standard Libra...	03-12 07:22	Success	-	View
exp_self.20260312071726.002_20260312_071812 Paper: self.20260312071726.002	Frequency-Modulated State Spaces (FMSS) Benchmark README.md Frequency-Modulated State Spaces (FMSS) Benchmark This benchmark evaluates the Frequency-Modulated State Spaces (FMSS) innovation, which applies multi-rate signal processing concepts to State Space Models (SSMs). The Innovatio...	03-12 07:18	Success	-	View
exp_self.20260312071539.001_20260312_071616 Paper: self.20260312071539.001	Entropy-Triggered State Snapshot (ETSS) Benchmark This benchmark evaluates the Entropy-Triggered State Snapshot (ETSS) hypothesis. The core idea is that in Low Entropy contexts (e.g., repetitive code, templates), the internal state of a State Space Model (SSM) changes minimally. By cal...	03-12 07:16	Success	-	View
exp_pytrain.20260312071407.001_20260312_071443 Paper: pytrain.20260312071407.001	Strictly Typed Generic Pipeline Benchmark README.md Strictly Typed Generic Pipeline Benchmark This benchmark evaluates a Python engineer's ability to design a robust, type-safe data processing framework using Python's `typing` module. Architecture Overview The solution implements a...	03-12 07:14	Success	-	View
exp_pytrain.20260310062524.001_20260310_062551 Paper: pytrain.20260310062524.001	Python Skill Fallback Title: Strict-Type Dynamic Module Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-10 06:25	Success	-	View
exp_self.20260309152420.007_20260309_152446 Paper: self.20260309152420.007	Section 1: README.md bash python benchmark.py	03-09 15:24	Success	-	View
exp_pytrain.20260309152138.004_20260309_152200 Paper: pytrain.20260309152138.004	```markdown No summary available yet.	03-09 15:22	Success	-	View
exp_self.20260309151933.006_20260309_152002 Paper: self.20260309151933.006	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260309151933.006 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-09 15:20	Success	-	View
exp_self.20260309151700.005_20260309_151725 Paper: self.20260309151700.005	Here is the runnable benchmark design for the SSM Strategy Stress Test. README.md bash python benchmark.py	03-09 15:17	Success	-	View
exp_pytrain.20260309151409.003_20260309_151434 Paper: pytrain.20260309151409.003	```markdown bash python benchmark.py ``` 3. The script will create a temporary directory structure, generate mock plugins, and attempt to load them. 4. It will verify that valid plugins are accepted and invalid ones are rejected based on the `Command`...	03-09 15:14	Success	-	View
exp_self.20260309151226.004_20260309_151249 Paper: self.20260309151226.004	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260309151226.004 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-09 15:12	Success	-	View
exp_self.20260309150928.003_20260309_150958 Paper: self.20260309150928.003	Self-directed benchmark: ssm strategy stress test README.md Self-directed benchmark: ssm strategy stress test Overview This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with a disciplined memory policy improves throughput and reduces VRAM usage compar...	03-09 15:10	Success	-	View
exp_pytrain.20260309150626.002_20260309_150719 Paper: pytrain.20260309150626.002	Generic Package Manifest Validator using PEP 695 Overview This benchmark evaluates the developer experience and runtime characteristics of Python 3.12's PEP 695 Type Parameter Syntax within the context of a generic package metadata validation system. Features * **PEP 695 Implementatio...	03-09 15:07	Success	-	View
exp_self.20260309150417.002_20260309_150447 Paper: self.20260309150417.002	Section 1: README.md bash python benchmark.py	03-09 15:05	Success	-	View
exp_self.20260309150111.001_20260309_150135 Paper: self.20260309150111.001	SSM Strategy Stress Test: Memory vs. Throughput README.md SSM Strategy Stress Test: Memory vs. Throughput Overview This benchmark evaluates the "disciplined memory policy" hypothesis for State Space Models (SSMs). The Innovation We compare a standard Transformer (Baseline) agains...	03-09 15:01	Success	-	View
exp_pytrain.20260309145820.001_20260309_145847 Paper: pytrain.20260309145820.001	Strictly Typed Package Manifest Generator This benchmark evaluates the creation of a strictly typed Python packaging utility using standard library type hinting features (PEP 484, PEP 621). Objective The goal is to write a script `manifest_gen.py` that simulates a lightweight packa...	03-09 14:58	Success	-	View
exp_pytrain.20260309145550.008_20260309_145612 Paper: pytrain.20260309145550.008	Generic Typed Registry Library Implementation README.md Generic Typed Registry Library Implementation This project implements a robust, type-safe registry component using Python 3.12's modern type parameter syntax (PEP 695). Features - Type Safety: Uses `class Registry[T]:` syntax...	03-09 14:56	Success	-	View
exp_self.20260309145401.013_20260309_145439 Paper: self.20260309145401.013	SSM Strategy Stress Test: Memory Policy Benchmark README.md SSM Strategy Stress Test: Memory Policy Benchmark Overview This benchmark evaluates the Innovation: Disciplined Memory Policy for State Space Models (SSM). The hypothesis is that applying strict memory management—specifically...	03-09 14:54	Success	-	View
exp_self.20260309145113.012_20260309_145140 Paper: self.20260309145113.012	SSM Strategy Stress Test Benchmark README.md SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that applying a State Space Model (SSM) strategy—specifically a disciplined memory policy based on chunking and state recurrence—improves throughput under...	03-09 14:51	Success	-	View
exp_pytrain.20260309144803.007_20260309_144839 Paper: pytrain.20260309144803.007	Python Skill Fallback Title: Generic Registry with Dynamic CLI Dispatcher - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-09 14:48	Success	-	View
exp_self.20260309144608.011_20260309_144643 Paper: self.20260309144608.011	SSM Strategy Stress Test Benchmark README.md SSM Strategy Stress Test Benchmark This benchmark evaluates the memory efficiency and throughput of a Selective State Space Model (SSM) strategy versus a standard Attention-based baseline (Transformer) under constrained memory con...	03-09 14:46	Success	-	View
exp_self.20260309144339.010_20260309_144406 Paper: self.20260309144339.010	SSM Strategy Stress Test: Memory vs Throughput README.md SSM Strategy Stress Test: Memory vs Throughput This benchmark evaluates the hypothesis that applying State Space Models (SSM) with a disciplined memory policy improves throughput under constrained VRAM (8GB) compared to standard a...	03-09 14:44	Success	-	View
exp_pytrain.20260309144048.006_20260309_144120 Paper: pytrain.20260309144048.006	Type-Safe Sliding Window KV Cache Implementation README.md Type-Safe Sliding Window KV Cache Implementation This benchmark evaluates the ability to implement a robust, type-safe data structure using only the Python standard library, mimicking the core logic of Key-Value (KV) caches found...	03-09 14:41	Success	-	View
exp_self.20260309143901.009_20260309_143929 Paper: self.20260309143901.009	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260309143901.009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-09 14:39	Success	-	View
exp_self.20260309143559.008_20260309_143623 Paper: self.20260309143559.008	Here is the runnable benchmark code designed to test the SSM strategy hypothesis. README.md SSM Strategy Stress Test: Dynamic Precision & Memory Policy Overview This benchmark evaluates the hypothesis that applying State Space Models (SSM) with a disciplined memory policy (specifically leveraging *Dynamic Precision...	03-09 14:36	Success	-	View
exp_pytrain.20260309143255.005_20260309_143336 Paper: pytrain.20260309143255.005	Strictly-Typed Plugin Registry System Design Brief This benchmark evaluates a Python implementation of a modular Plugin Registry system. The system leverages Python's advanced typing features—specifically `typing.TypeVar`, `abc.ABC`, and `typing.Protocol`—to enforce compile...	03-09 14:33	Success	-	View
exp_self.20260309143037.007_20260309_143124 Paper: self.20260309143037.007	Section 1: README.md bash pip install torch transformers accelerate bash python benchmark.py MODE: ablated_fp32 VRAM_USAGE: <value>MB TOKENS_PER_SEC: <value> RESULT: <status> --- MODE: optimized_bf16 VRAM_USAGE: <value>MB TOKENS_PER_SEC: <value> RESULT: <status...	03-09 14:31	Success	-	View
exp_self.20260309142742.006_20260309_142819 Paper: self.20260309142742.006	SSM Strategy Stress Test README.md SSM Strategy Stress Test Overview This benchmark evaluates a State Space Model (SSM) workload under strict memory constraints (simulating an 8GB VRAM limit). It compares a standard baseline implementation against an **optimize...	03-09 14:28	Success	-	View
exp_pytrain.20260309142438.004_20260309_142516 Paper: pytrain.20260309142438.004	Benchmark: Strictly-Typed Configuration Abstraction Layer README.md Benchmark: Strictly-Typed Configuration Abstraction Layer Overview This benchmark evaluates the design and implementation of a strictly-typed, generic configuration system in Python. It focuses on leveraging Python's `typing` modu...	03-09 14:25	Success	-	View
exp_self.20260309142146.005_20260309_142225 Paper: self.20260309142146.005	Benchmark: SSM Strategy Stress Test README.md Benchmark: SSM Strategy Stress Test This benchmark evaluates the hypothesis that a State Space Model (SSM) strategy, specifically employing a disciplined memory policy (recurrent state caching) and dynamic precision, yields superi...	03-09 14:23	Success	-	View
exp_self.20260309141907.004_20260309_141945 Paper: self.20260309141907.004	SSM Strategy Stress Test README.md SSM Strategy Stress Test Innovation: Disciplined SSM Memory Policy This benchmark tests the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy (chunking/caching + dynamic precision) improves in...	03-09 14:19	Success	-	View
exp_pytrain.20260309141526.003_20260309_141640 Paper: pytrain.20260309141526.003	Runtime-Verified Plugin Loader Benchmark This benchmark evaluates your ability to construct a robust, modular plugin architecture using Python's standard library. The goal is to implement a plugin loader that utilizes structural subtyping (`typing.Protocol`) for runtime safety and...	03-09 14:16	Success	-	View
exp_self.20260309141307.003_20260309_141335 Paper: self.20260309141307.003	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260309141307.003 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-09 14:13	Success	-	View
exp_self.20260309141002.002_20260309_141049 Paper: self.20260309141002.002	This benchmark evaluates the SSM Strategy Stress Test. README.md This benchmark evaluates the SSM Strategy Stress Test. Hypothesis: Applying a State Space Model (SSM) approach with a disciplined memory policy (fixed state size) improves throughput compared to standard attention mechanis...	03-09 14:10	Success	-	View
exp_pytrain.20260309140641.002_20260309_140737 Paper: pytrain.20260309140641.002	PEP 695 Generic Plugin Loader Benchmark README.md PEP 695 Generic Plugin Loader Benchmark Overview This coding drill tests the implementation of Python 3.12's PEP 695 Type Parameter Syntax within the context of a dynamic plugin architecture. It demonstrates how the new syntax red...	03-09 14:07	Success	-	View
exp_self.20260309140320.001_20260309_140409 Paper: self.20260309140320.001	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260309140320.001 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-09 14:04	Success	-	View
exp_pytrain.20260309140012.001_20260309_140048 Paper: pytrain.20260309140012.001	Strictly Typed Dependency Resolver Benchmark README.md Strictly Typed Dependency Resolver Benchmark Overview This benchmark implements a minimal dependency resolution engine utilizing Python's advanced static typing features (`TypedDict`, `Protocol`, and `Generics`). It demonstrates h...	03-09 14:00	Success	-	View
exp_pytrain.20260309135710.030_20260309_135745 Paper: pytrain.20260309135710.030	Asynchronous Data Pipeline with Strict Typing README.md Asynchronous Data Pipeline with Strict Typing Overview This coding drill evaluates your ability to construct a robust, IO-bound data processing pipeline using modern Python type hinting (PEP 484) and asynchronous programming primi...	03-09 13:57	Success	-	View
exp_self.20260309135203.054_20260309_135230 Paper: self.20260309135203.054	Here is the runnable benchmark design. Section 1: README.md This benchmark compares a standard Transformer-based approach (Baseline) against an SSM-inspired Linear Recurrent approach (Optimized) to test the hypothesis that disciplined memory policies improve throughput under con...	03-09 13:55	Success	-	View
exp_pytrain.20260309134923.029_20260309_134943 Paper: pytrain.20260309134923.029	Runtime Interface Compliance Validator using Importlib README.md Runtime Interface Compliance Validator using Importlib Overview This coding drill implements a robust plugin architecture validation system. It demonstrates how to use Python's `typing.Protocol` to enforce structural subtyping (du...	03-09 13:49	Success	-	View
exp_self.20260309134723.053_20260309_134756 Paper: self.20260309134723.053	Here is the runnable benchmark design. bash pip install torch python benchmark.py ```	03-09 13:48	Success	-	View
exp_self.20260309134448.052_20260309_134516 Paper: self.20260309134448.052	Here is the runnable benchmark for the SSM Strategy Stress Test. README.md Self-directed benchmark: SSM Strategy Stress Test Hypothesis Applying SSM (State Space Model) logic with a disciplined memory policy (specifically dynamic precision and selective state caching) improves inference throughput and re...	03-09 13:45	Success	-	View
exp_pytrain.20260309134146.028_20260309_134238 Paper: pytrain.20260309134146.028	Generic Resource Manager & ZipApp Packager Benchmark This benchmark tests a developer's ability to leverage modern Python type hinting (PEP 695) to create strict, generic data structures, and then utilize standard library packaging tools (`zipapp`) to distribute them. Prerequisites * **Python...	03-09 13:42	Success	-	View
exp_self.20260309134004.051_20260309_134035 Paper: self.20260309134004.051	SSM Strategy Stress Test Benchmark README.md SSM Strategy Stress Test Benchmark This benchmark evaluates the hypothesis that SSM (State Space Model) strategies, particularly those mimicking Mamba-style memory management, offer superior throughput and lower VRAM footprint...	03-09 13:40	Success	-	View
exp_self.20260309133738.050_20260309_133813 Paper: self.20260309133738.050	SSM Strategy Stress Test Benchmark README.md SSM Strategy Stress Test Benchmark This benchmark evaluates the memory efficiency and throughput of a State Space Model (SSM) strategy against a standard Attention-based (Transformer) mechanism under constrained VRAM conditions (s...	03-09 13:38	Success	-	View
exp_pytrain.20260309133519.027_20260309_133540 Paper: pytrain.20260309133519.027	Strictly-Typed Component Registry with Dynamic Imports README.md Strictly-Typed Component Registry with Dynamic Imports Overview This coding drill demonstrates the creation of a robust, type-safe plugin architecture using Python's standard library. It leverages advanced `typing` features (Gener...	03-09 13:35	Success	-	View
exp_self.20260309133324.049_20260309_133358 Paper: self.20260309133324.049	SSM Strategy Stress Test Benchmark README.md SSM Strategy Stress Test Benchmark Overview This benchmark evaluates the performance of a Selective State Space Model (SSM) implementation under constrained memory conditions (simulating an 8GB VRAM limit). It compares a **Standar...	03-09 13:34	Success	-	View
exp_self.20260309133058.048_20260309_133119 Paper: self.20260309133058.048	```markdown README.md	03-09 13:31	Success	-	View
exp_pytrain.20260309132844.026_20260309_132908 Paper: pytrain.20260309132844.026	Typed Dependency Injection Container Benchmark README.md Typed Dependency Injection Container Benchmark Overview This benchmark tests the engineering capability to construct a robust, type-driven Dependency Injection (DI) Container from scratch using only the Python Standard Library...	03-09 13:29	Success	-	View
exp_self.20260309132655.047_20260309_132727 Paper: self.20260309132655.047	Title: SSM Strategy Stress Test: Linear vs. Quadratic Memory README.md Title: SSM Strategy Stress Test: Linear vs. Quadratic Memory Hypothesis: Applying SSM (State Space Model) logic with a disciplined memory policy (constant state size) improves throughput under 8GB constraints compared to s...	03-09 13:27	Success	-	View
exp_self.20260309132423.046_20260309_132453 Paper: self.20260309132423.046	Here is the design for the SSM Strategy Stress Test benchmark. Design Rationale This benchmark compares a standard Transformer Encoder (which relies on $O(N^2)$ Attention) against a custom State Space Model (SSM) implementation (which relies on $O(N)$ recurrence). * Innovation: The `SSM_Mamba` modu...	03-09 13:25	Success	-	View
exp_pytrain.20260309132156.025_20260309_132216 Paper: pytrain.20260309132156.025	Strictly Typed Plugin Registry with Runtime Validation This benchmark evaluates a Python developer's ability to construct robust, maintainable plugin architectures using modern type hinting features (`typing.Protocol`, `typing.Generic`, `typing.TypeVar`) and runtime validation mechanisms. Overv...	03-09 13:22	Success	-	View
exp_self.20260309132000.045_20260309_132037 Paper: self.20260309132000.045	README.md bash python benchmark.py	03-09 13:20	Success	-	View
exp_self.20260309131702.044_20260309_131738 Paper: self.20260309131702.044	Self-directed benchmark: ssm strategy stress test Hypothesis Applying an SSM (State Space Model) with a disciplined memory policy (fixed state size vs. growing KV cache) improves throughput and reduces VRAM pressure under 8GB constraints compared to standard attention mechanisms. Plan This...	03-09 13:17	Success	-	View
exp_pytrain.20260309131430.024_20260309_131453 Paper: pytrain.20260309131430.024	Section 1: README.md Runtime-Verified Plugin Loader with Strict Typing Overview This benchmark tests the ability to construct a robust plugin system in Python using `typing.Protocol` and `runtime_checkable`. It simulates a high-assurance environment where stati...	03-09 13:14	Success	-	View
exp_self.20260309131255.043_20260309_131326 Paper: self.20260309131255.043	I will create a benchmark for "SSM Memory Policy Stress Test". The code will define a synthetic SSM workload using pure... README.md This section explains the purpose, setup, and interpretation of the benchmark. benchmark.py This section contains the runnable code. - It defines a simplified SSM block (Selective State Space). - It implements two modes: `...	03-09 13:13	Success	-	View
exp_self.20260309131032.042_20260309_131059 Paper: self.20260309131032.042	This benchmark evaluates the hypothesis that a State Space Model (SSM) strategy with a disciplined memory policy improve... README.md This benchmark evaluates the hypothesis that a State Space Model (SSM) strategy with a disciplined memory policy improves throughput under 8GB VRAM constraints compared to a standard baseline (simulated via dense linear layers/sta...	03-09 13:11	Success	-	View
exp_pytrain.20260309130821.023_20260309_130841 Paper: pytrain.20260309130821.023	Title: Strictly-Typed Dynamic Plugin Loader README.md Title: Strictly-Typed Dynamic Plugin Loader Topic: `typing`, `packaging`, `importlib` Overview: This benchmark evaluates the ability to construct a robust dynamic module loading system using only the Python standard li...	03-09 13:08	Success	-	View
exp_self.20260309130609.041_20260309_130640 Paper: self.20260309130609.041	```markdown README.md	03-09 13:06	Success	-	View
exp_self.20260309130352.040_20260309_130419 Paper: self.20260309130352.040	This benchmark validates the hypothesis that applying State Space Model (SSM) strategies with a disciplined memory p... README.md This benchmark validates the hypothesis that applying State Space Model (SSM) strategies with a disciplined memory policy significantly improves throughput and reduces VRAM overhead compared to naive implementations under cons...	03-09 13:04	Success	-	View
exp_pytrain.20260309130042.022_20260309_130123 Paper: pytrain.20260309130042.022	Python Skill Fallback Title: Dynamic Generic Plugin Loader with PEP 695 - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-09 13:01	Success	-	View
exp_self.20260309124842.039_20260309_124908 Paper: self.20260309124842.039	```markdown bash python benchmark.py	03-09 12:59	Success	-	View
exp_pytrain.20260309124621.021_20260309_124645 Paper: pytrain.20260309124621.021	Strictly Typed Generic Data Pipeline Benchmark README.md Strictly Typed Generic Data Pipeline Benchmark Overview This benchmark evaluates the implementation of a robust, modular data processing pipeline using Python's advanced typing features. It enforces strict standards regarding API...	03-09 12:46	Success	-	View
exp_self.20260309124417.038_20260309_124454 Paper: self.20260309124417.038	SSM Strategy Stress Test: Dynamic Precision Benchmarking README.md SSM Strategy Stress Test: Dynamic Precision Benchmarking Overview This benchmark evaluates the performance impact of applying a Dynamic Precision memory policy to a State Space Model (SSM) architecture. It simulates a lightwei...	03-09 12:45	Success	-	View
exp_self.20260309124047.037_20260309_124121 Paper: self.20260309124047.037	SSM Strategy Stress Test Benchmark README.md SSM Strategy Stress Test Benchmark This benchmark evaluates the performance efficiency of State Space Models (SSM) compared to standard Attention mechanisms when processing long sequences under constrained memory (8GB VRAM target)...	03-09 12:42	Success	-	View
exp_pytrain.20260309123836.020_20260309_123857 Paper: pytrain.20260309123836.020	Benchmark: Typed Plugin Architecture for Model Registry README.md Benchmark: Typed Plugin Architecture for Model Registry This benchmark demonstrates the implementation of a robust, type-safe plugin system often found in modern Machine Learning frameworks (like LitGPT or PyTorch). It enforces st...	03-09 12:39	Success	-	View
exp_self.20260309122642.036_20260309_122715 Paper: self.20260309122642.036	Self-directed benchmark: ssm strategy stress test README.md Self-directed benchmark: ssm strategy stress test Hypothesis Applying ssm with disciplined memory policy improves throughput under 8GB constraints. Plan Benchmark a standard caching mechanism (Baseline) against a fixed-state SSM-l...	03-09 12:37	Success	-	View
exp_pytrain.20260309122420.019_20260309_122441 Paper: pytrain.20260309122420.019	Dynamic Plugin Loader with Protocol Enforcement README.md Title: Dynamic Plugin Loader with Protocol Enforcement Description Modern ML frameworks like HuggingFace Transformers rely on dynamic module loading to support hundreds of model architectures without hard-coding dependencies. This...	03-09 12:24	Success	-	View
exp_self.20260309122202.035_20260309_122239 Paper: self.20260309122202.035	SSM Strategy Stress Test README.md SSM Strategy Stress Test This benchmark evaluates the SSM Strategy Stress Test, focusing on the hypothesis that a disciplined memory policy combined with State Space Model (SSM) architectures improves throughput under strict 8...	03-09 12:23	Success	-	View
exp_self.20260309121947.034_20260309_122013 Paper: self.20260309121947.034	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260309121947.034 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-09 12:20	Success	-	View
exp_pytrain.20260309121712.018_20260309_121735 Paper: pytrain.20260309121712.018	Dynamic Type-Verified Plugin Loader README.md Dynamic Type-Verified Plugin Loader Overview This benchmark evaluates a Python system's ability to dynamically generate code, manage temporary package structures, and verify runtime type safety using the `typing` module. Problem D...	03-09 12:17	Success	-	View
exp_self.20260309121540.033_20260309_121610 Paper: self.20260309121540.033	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260309121540.033 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-09 12:16	Success	-	View
exp_self.20260309121312.032_20260309_121342 Paper: self.20260309121312.032	Here is the benchmark design for the SSM Strategy Stress Test, focusing on disciplined memory policies (specifically Dyn... bash python benchmark.py	03-09 12:13	Success	-	View
exp_pytrain.20260309121059.017_20260309_121122 Paper: pytrain.20260309121059.017	Python Skill Fallback Title: Runtime Type-Safe Dynamic Package Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-09 12:11	Success	-	View
exp_self.20260309120912.031_20260309_120939 Paper: self.20260309120912.031	SSM Strategy Stress Test: Memory vs. Throughput README.md SSM Strategy Stress Test: Memory vs. Throughput Innovation: Selective State Space Model (SSM) vs. Standard Attention Hypothesis: Applying SSM with disciplined memory policy and dynamic precision improves throughput under 8...	03-09 12:09	Success	-	View
exp_self.20260309120643.030_20260309_120713 Paper: self.20260309120643.030	Section 1: README.md SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that State Space Model (SSM) strategies, specifically when combined with a disciplined memory policy and chunked recurrence, provide superior throughput under str...	03-09 12:07	Success	-	View
exp_pytrain.20260309120424.016_20260309_120506 Paper: pytrain.20260309120424.016	Strictly Typed Modular Data Pipeline README.md Title: Strictly Typed Modular Data Pipeline Design Brief Hypothesis: Utilizing Python's type hinting system (specifically Protocols and Generics) combined with strict module encapsulation practices yields code that is signific...	03-09 12:05	Success	-	View
exp_self.20260309120240.029_20260309_120304 Paper: self.20260309120240.029	SSM Strategy Stress Test README.md SSM Strategy Stress Test This benchmark evaluates the performance impact of a Disciplined Memory Policy and Dynamic Precision on State Space Models (SSMs). Hypothesis Applying SSM architectures with disciplined memory mana...	03-09 12:03	Success	-	View
exp_self.20260309120012.028_20260309_120041 Paper: self.20260309120012.028	SSM Strategy Stress Test: Memory vs. Throughput README.md SSM Strategy Stress Test: Memory vs. Throughput Overview This benchmark evaluates the hypothesis that applying State Space Model (SSM) strategies with a disciplined memory policy (specifically chunked recurrence and dynamic pr...	03-09 12:00	Success	-	View
exp_pytrain.20260309115748.015_20260309_115810 Paper: pytrain.20260309115748.015	Strict Generic Resource Registry README.md Strict Generic Resource Registry This coding drill benchmarks a robust, zero-dependency implementation of a `ResourceRegistry` leveraging PEP 695 Type Parameter Syntax (introduced in Python 3.12). Hypothesis Using PEP 695 synt...	03-09 11:58	Success	-	View
exp_self.20260309115559.027_20260309_115632 Paper: self.20260309115559.027	Self-directed benchmark: SSM Strategy Stress Test README.md Benchmark Overview This benchmark evaluates the efficiency of State Space Models (SSM) versus traditional Transformer-style Attention mechanisms when operating under strict hardware constraints (8GB VRAM). The Innovation:...	03-09 11:56	Success	-	View
exp_self.20260309115325.026_20260309_115352 Paper: self.20260309115325.026	Benchmark: SSM Strategy Stress Test README.md Benchmark: SSM Strategy Stress Test Overview This benchmark tests the hypothesis that applying a State Space Model (SSM) with a disciplined memory policy (specifically state caching) improves throughput under constrained VRAM (8GB...	03-09 11:54	Success	-	View
exp_pytrain.20260309115133.014_20260309_115152 Paper: pytrain.20260309115133.014	Python Skill Fallback Title: Dynamic Recipe Loader with Structural Typing - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-09 11:51	Success	-	View
exp_self.20260309114921.025_20260309_115002 Paper: self.20260309114921.025	```markdown bash python benchmark.py ```	03-09 11:50	Success	-	View
exp_self.20260309114644.024_20260309_114714 Paper: self.20260309114644.024	Benchmark: SSM Strategy Stress Test README.md Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that applying State Space Models (SSM) with a disciplined memory policy improves inference throughput and memory efficiency compared to stand...	03-09 11:47	Success	-	View
exp_pytrain.20260309114426.013_20260309_114456 Paper: pytrain.20260309114426.013	Strictly Typed Semantic Version Plugin Loader README.md Title: Strictly Typed Semantic Version Plugin Loader Description: This benchmark evaluates the ability to write robust, strictly typed Python code using advanced standard library features. The objective is to implement a s...	03-09 11:44	Success	-	View
exp_self.20260309114231.023_20260309_114313 Paper: self.20260309114231.023	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260309114231.023 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-09 11:43	Success	-	View
exp_self.20260309113949.022_20260309_114021 Paper: self.20260309113949.022	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260309113949.022 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-09 11:40	Success	-	View
exp_pytrain.20260309113709.012_20260309_113729 Paper: pytrain.20260309113709.012	AutoFactory Pattern Implementation with Strict Typing README.md AutoFactory Pattern Implementation with Strict Typing Overview This coding drill implements a robust, maintainable plugin architecture using Python's `__init_subclass__` hook and `typing.Protocol`. This design pattern mimics the r...	03-09 11:37	Success	-	View
exp_self.20260309113434.021_20260309_113502 Paper: self.20260309113434.021	SSM Strategy Stress Test README.md SSM Strategy Stress Test Overview This benchmark evaluates the Disciplined Memory Policy hypothesis for State Space Models (SSMs). It compares a naive SSM implementation (which retains extensive history/cache) against an optim...	03-09 11:35	Success	-	View
exp_self.20260309113213.020_20260309_113238 Paper: self.20260309113213.020	```markdown bash python benchmark.py ``` Expected Output The script will output VRAM usage in Megabytes (MB) and Tokens per Second (TPS) for both the Baseline and the SSM variant, followed by a verification summary.	03-09 11:32	Success	-	View
exp_pytrain.20260309112909.011_20260309_112935 Paper: pytrain.20260309112909.011	Dynamic Plugin Loader with Structural Subtyping Overview This benchmark tests a developer's ability to implement a robust, type-safe plugin system using Python's standard library. It leverages Structural Subtyping (via `typing.Protocol`) to enforce interfaces without explicit inherit...	03-09 11:29	Success	-	View
exp_self.20260309112736.019_20260309_112759 Paper: self.20260309112736.019	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260309112736.019 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-09 11:28	Success	-	View
exp_self.20260309112510.018_20260309_112534 Paper: self.20260309112510.018	SSM Strategy Stress Test README.md SSM Strategy Stress Test Innovation: Self-directed benchmark: ssm strategy stress test Concept: State Space Models (SSM), Memory Policy, Dynamic Precision Overview This benchmark evaluates the hypothesis that applying a di...	03-09 11:25	Success	-	View
exp_pytrain.20260309112218.010_20260309_112255 Paper: pytrain.20260309112218.010	Strictly Typed Dynamic Package Generator Benchmark This benchmark tests the ability to programmatically construct a Python package structure containing strictly typed code. It verifies that the generated module can be imported dynamically and that its type hints are correctly introspected u...	03-09 11:22	Success	-	View
exp_self.20260309110955.017_20260309_111020 Paper: self.20260309110955.017	Here is the runnable benchmark design. README.md bash python benchmark.py	03-09 11:20	Success	-	View
exp_self.20260309110659.016_20260309_110729 Paper: self.20260309110659.016	--- README.md --- SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that applying a State Space Model (SSM) with a disciplined chunked memory policy significantly improves throughput and reduces VRAM pressure compare...	03-09 11:07	Success	-	View
exp_pytrain.20260309110415.009_20260309_110442 Paper: pytrain.20260309110415.009	Dynamic Type-Safe Plugin Loader This benchmark tests the ability of a Python system to dynamically generate code, scaffold a file system structure, and perform runtime type validation using `typing.Protocol`. Context Modern Python applications often rely on plugin archite...	03-09 11:04	Success	-	View
exp_self.20260309110228.015_20260309_110257 Paper: self.20260309110228.015	This benchmark evaluates the hypothesis that SSM (State Space Model) strategies with disciplined memory policies sig... README.md This benchmark evaluates the hypothesis that SSM (State Space Model) strategies with disciplined memory policies significantly improve throughput and reduce VRAM overhead compared to standard attention mechanisms under long-co...	03-09 11:03	Success	-	View
exp_self.20260309110018.014_20260309_110040 Paper: self.20260309110018.014	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260309110018.014 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-09 11:00	Success	-	View
exp_pytrain.20260309105743.008_20260309_105807 Paper: pytrain.20260309105743.008	Python Skill Fallback Title: Strictly-Typed Dependency Resolver with PEP 695 - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-09 10:58	Success	-	View
exp_self.20260309105537.013_20260309_105624 Paper: self.20260309105537.013	Self-directed benchmark: ssm strategy stress test README.md This benchmark investigates the hypothesis that applying Selective State Space Models (SSM) with a disciplined memory policy and dynamic precision improves throughput and efficiency under strict memory constraints (8GB). **Bac...	03-09 10:56	Success	-	View
exp_self.20260309105309.012_20260309_105337 Paper: self.20260309105309.012	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260309105309.012 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-09 10:53	Success	-	View
exp_pytrain.20260309105048.007_20260309_105125 Paper: pytrain.20260309105048.007	Python Skill Fallback Title: Dynamic Plugin Registry with Runtime Type Validation - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-09 10:51	Success	-	View
exp_self.20260309104839.011_20260309_104926 Paper: self.20260309104839.011	Self-directed benchmark: SSM Strategy Stress Test README.md This benchmark evaluates the hypothesis that applying a State Space Model (SSM) with a disciplined memory policy (specifically, fixed-state recurrent processing) improves inference throughput and efficiency under tight 8GB VRAM co...	03-09 10:49	Success	-	View
exp_self.20260309104554.010_20260309_104623 Paper: self.20260309104554.010	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260309104554.010 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-09 10:46	Success	-	View
exp_pytrain.20260309104318.006_20260309_104400 Paper: pytrain.20260309104318.006	Dynamic Type-Safe Plugin Loader with Auto-Discovery README.md Dynamic Type-Safe Plugin Loader with Auto-Discovery This benchmark demonstrates a robust implementation of a dynamic plugin loading system using only the Python standard library. It simulates an environment similar to machine lear...	03-09 10:44	Success	-	View
exp_self.20260309104129.009_20260309_104211 Paper: self.20260309104129.009	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260309104129.009 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-09 10:42	Success	-	View
exp_self.20260309103839.008_20260309_103905 Paper: self.20260309103839.008	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260309103839.008 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-09 10:39	Success	-	View
exp_pytrain.20260309103604.005_20260309_103630 Paper: pytrain.20260309103604.005	Dynamic Module Loader with Strict Protocol Enforcement README.md Dynamic Module Loader with Strict Protocol Enforcement Overview This coding drill evaluates the implementation of a robust plugin loading system in Python. It focuses on decoupling interface definition from implementation using `t...	03-09 10:36	Success	-	View
exp_self.20260309103405.007_20260309_103439 Paper: self.20260309103405.007	SSM Strategy Stress Test: Memory vs. Throughput README.md SSM Strategy Stress Test: Memory vs. Throughput Overview This benchmark evaluates the hypothesis that applying State Space Models (SSM) with a disciplined memory policy significantly improves throughput and reduces VRAM pressu...	03-09 10:34	Success	-	View
exp_self.20260309103106.006_20260309_103151 Paper: self.20260309103106.006	README.md Self-directed benchmark: SSM Strategy Stress Test This benchmark evaluates the performance of a State Space Model (SSM) architecture, specifically focusing on the impact of a disciplined memory policy and dynamic precision on throughput and...	03-09 10:31	Success	-	View
exp_pytrain.20260309102743.004_20260309_102823 Paper: pytrain.20260309102743.004	Benchmark: Dynamic Plugin Loader with Structural Subtyping README.md Benchmark: Dynamic Plugin Loader with Structural Subtyping Overview This benchmark evaluates a Python architectural pattern combining dynamic code loading with structural subtyping (Protocols). The objective is to implement a robu...	03-09 10:28	Success	-	View
exp_self.20260309102554.005_20260309_102619 Paper: self.20260309102554.005	```markdown README.md bash pip install torch python benchmark.py ```	03-09 10:26	Success	-	View
exp_self.20260309102235.004_20260309_102312 Paper: self.20260309102235.004	SSM Strategy Stress Test: Benchmarking Memory Policy README.md SSM Strategy Stress Test: Benchmarking Memory Policy Overview This benchmark evaluates the performance of Selective State Space Models (SSM) compared to traditional Transformer architectures. Specifically, it tests the hypothe...	03-09 10:24	Success	-	View
exp_pytrain.20260309101934.003_20260309_102019 Paper: pytrain.20260309101934.003	Strict Dynamic Plugin Loader with Runtime Protocol Validation README.md Strict Dynamic Plugin Loader with Runtime Protocol Validation Overview This benchmark evaluates the design of a robust runtime plugin loader that simulates package structures using `types` and `sys` standard library modules. It en...	03-09 10:20	Success	-	View
exp_self.20260309101708.003_20260309_101748 Paper: self.20260309101708.003	```markdown bash python benchmark.py	03-09 10:18	Success	-	View
exp_self.20260309101438.002_20260309_101507 Paper: self.20260309101438.002	Self-directed benchmark: SSM Strategy Stress Test README.md Self-directed benchmark: SSM Strategy Stress Test Hypothesis Applying a State Space Model (SSM) approach with a disciplined memory policy (simulating selective state retention and chunked processing) improves inference throughput...	03-09 10:15	Success	-	View
exp_pytrain.20260309101133.002_20260309_101209 Paper: pytrain.20260309101133.002	Typed Configuration Validator using PEP 695 README.md Typed Configuration Validator using PEP 695 This benchmark demonstrates the usage of Python 3.12's Type Parameter Syntax (PEP 695) to create a robust, zero-dependency configuration validation micro-library. Features - **Generic Cl...	03-09 10:12	Success	-	View
exp_self.20260309100716.001_20260309_100754 Paper: self.20260309100716.001	Here is the runnable benchmark for the SSM strategy stress test. bash python benchmark.py markdown	03-09 10:10	Success	-	View
exp_pytrain.20260309100256.001_20260309_100328 Paper: pytrain.20260309100256.001	This benchmark evaluates the efficiency and robustness of a dynamic plugin loading system built using Python's `typing.P... README.md This benchmark evaluates the efficiency and robustness of a dynamic plugin loading system built using Python's `typing.Protocol` for structural subtyping. Objective: The goal is to simulate a "plugin manager" that dynamically...	03-09 10:03	Success	-	View
exp_self.20260309090324.030_20260309_090353 Paper: self.20260309090324.030	Self-Directed Benchmark: SSM Strategy Stress Test README.md Self-Directed Benchmark: SSM Strategy Stress Test Overview This benchmark evaluates the hypothesis that applying State Space Models (SSMs) with a disciplined memory policy improves throughput under strict 8GB VRAM constraints. Hyp...	03-09 09:03	Pending	-	View
exp_pytrain.20260309090036.017_20260309_090117 Paper: pytrain.20260309090036.017	Typed ZipApp Generator README.md Title: Typed ZipApp Generator Overview This benchmark evaluates a Python system's ability to dynamically generate, package, and verify a typed command-line application using only the standard library. Design Goals 1. **Dependency-...	03-09 09:01	Success	-	View
exp_self.20260309085725.029_20260309_085851 Paper: self.20260309085725.029	Self-directed benchmark: ssm strategy stress test README.md --- SSM Strategy Stress Test Benchmark Overview This benchmark evaluates the SSM Strategy against a standard Attention Baseline (Transformer) to validate the hypothesis: *applying ssm with disciplined memory policy improve...	03-09 08:58	Success	-	View
exp_pytrain.20260309085410.016_20260309_085447 Paper: pytrain.20260309085410.016	Typed Dependency Injection Container Benchmark README.md Title: Typed Dependency Injection Container Benchmark Design Brief This benchmark validates the hypothesis that Strict type hinting and Protocol-based design allow for the creation of robust dependency injection (DI) mechanism...	03-09 08:54	Success	-	View
exp_self.20260309085201.028_20260309_085236 Paper: self.20260309085201.028	Self-directed benchmark: ssm strategy stress test Paper ID: self.20260309085201.028 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered...	03-09 08:52	Success	-	View
exp_self.20260309084855.027_20260309_084930 Paper: self.20260309084855.027	Here is the design for the SSM Strategy Stress Test benchmark. No summary available yet.	03-09 08:49	Success	-	View
exp_pytrain.20260309084648.015_20260309_084711 Paper: pytrain.20260309084648.015	Dynamic Module Injection and Strict Protocol Validation README.md Dynamic Module Injection and Strict Protocol Validation Overview This benchmark evaluates a system's ability to simulate a Python packaging environment by dynamically generating, compiling, and injecting modules into `sys.modules`...	03-09 08:47	Success	-	View
exp_self.20260309084435.026_20260309_084501 Paper: self.20260309084435.026	SSM Strategy Stress Test README.md SSM Strategy Stress Test Innovation: Self-directed benchmark: ssm strategy stress test Hypothesis: Applying SSM with a disciplined memory policy improves throughput under 8GB constraints. Description This benchmark compare...	03-09 08:45	Success	-	View
exp_self.20260309084211.025_20260309_084235 Paper: self.20260309084211.025	Section 1: README.md SSM Strategy Stress Test This benchmark evaluates the hypothesis that State Space Models (SSM), when combined with disciplined memory policies (specifically state reduction and dynamic precision), offer superior throughput and memory ef...	03-09 08:42	Success	-	View
exp_pytrain.20260309083929.014_20260309_084005 Paper: pytrain.20260309083929.014	Generic Plugin Registry & Factory Benchmark README.md Generic Plugin Registry & Factory Benchmark Overview This benchmark simulates a core component of large-scale AI frameworks like LitGPT: a modular, type-safe plugin system. It challenges the implementation to utilize Python's adva...	03-09 08:40	Success	-	View
exp_self.20260309083735.024_20260309_083813 Paper: self.20260309083735.024	Section 1: README.md Section 2: benchmark.py	03-09 08:38	Success	-	View
exp_self.20260309083456.023_20260309_083549 Paper: self.20260309083456.023	Logit-Gated State Skipping Benchmark README.md Logit-Gated State Skipping Benchmark Overview This benchmark tests the Logit-Gated State Skipping hypothesis on a simplified State Space Model (SSM). The core idea is to reduce computational overhead by skipping the state upda...	03-09 08:35	Success	-	View
exp_pytrain.20260309083231.013_20260309_083303 Paper: pytrain.20260309083231.013	Virtual Package Dispatcher with Protocol Validation README.md Virtual Package Dispatcher with Protocol Validation Design Brief Hypothesis: An autonomous coding system can simulate a complex package ecosystem by generating virtual modules in-memory, validating them against strict runtime...	03-09 08:33	Success	-	View
exp_self.20260309082955.022_20260309_083030 Paper: self.20260309082955.022	Gated Linear Attention (GLA) to SSM Bridge: Innovation Benchmark README.md Gated Linear Attention (GLA) to SSM Bridge: Innovation Benchmark Hypothesis Gated Linear Attention (GLA) and State Space Models (SSMs) share fundamental mathematical properties as linear recurrent systems. This benchmark tests the...	03-09 08:30	Success	-	View
exp_self.20260309082749.021_20260309_082812 Paper: self.20260309082749.021	Benchmark: Delta-State Quantization (DSQ) for SSMs README.md Benchmark: Delta-State Quantization (DSQ) for SSMs Overview This benchmark evaluates Delta-State Quantization (DSQ), a technique designed to improve the efficiency of State Space Models (SSMs) like Mamba. The Innovation: S...	03-09 08:28	Success	-	View
exp_pytrain.20260309082605.012_20260309_082643 Paper: pytrain.20260309082605.012	```markdown README.md bash python benchmark.py	03-09 08:26	Success	-	View
exp_oa_W7131910431_20260309_082329 Paper: oa_W7131910431	SideQuest: Model-Driven KV Cache Management Benchmark README.md SideQuest: Model-Driven KV Cache Management Benchmark This repository contains a benchmark designed to evaluate SideQuest, a novel approach to KV cache management for long-horizon agentic reasoning. Overview Large Language Mod...	03-09 08:23	Success	-	View
exp_self.20260309082110.020_20260309_082157 Paper: self.20260309082110.020	This benchmark evaluates the efficacy of Frequency-Domain State Compression for State Space Models (SSMs). README.md This benchmark evaluates the efficacy of Frequency-Domain State Compression for State Space Models (SSMs). Concept Standard SSMs (like Mamba) maintain a large hidden state vector $h_t$ that evolves over time. This hidden state...	03-09 08:22	Success	-	View
exp_pytrain.20260309081908.011_20260309_081948 Paper: pytrain.20260309081908.011	This benchmark verifies the ability to construct a robust, dynamic plugin loading system using Python's standard library... README.md This benchmark verifies the ability to construct a robust, dynamic plugin loading system using Python's standard library. It tests the candidate's understanding of `importlib`, `typing.Protocol`, and exception handling within a fi...	03-09 08:19	Success	-	View
exp_self.20260309081640.019_20260309_081731 Paper: self.20260309081640.019	Pinned-Window 4-bit State Streaming Paper ID: self.20260309081640.019 - Hypothesis: Standard VRAM overflow crashes training. By implementing a ring-buffer in pinned CPU memory and syncing only the active state window in FP16 to GPU, we can train on infinite sequences. - Plan:...	03-09 08:17	Success	-	View
exp_self.20260309081427.018_20260309_081459 Paper: self.20260309081427.018	CPU-Pinned State Recycle Cache Benchmark README.md CPU-Pinned State Recycle Cache Benchmark This benchmark tests the CPU-Pinned State Recycle Cache innovation designed for SSM/Mamba architectures running on memory-constrained GPUs. The Innovation Standard SSM blocks maintain t...	03-09 08:15	Success	-	View
exp_pytrain.20260309081208.010_20260309_081313 Paper: pytrain.20260309081208.010	Modular Asynchronous Log Processor README.md Modular Asynchronous Log Processor Overview This benchmark verifies the structural integrity, type safety, and performance of a modular asynchronous log processing system. It simulates a "drill" where a library component `async_pr...	03-09 08:13	Success	-	View
exp_self.20260309081004.017_20260309_081034 Paper: self.20260309081004.017	Dynamic State Quantization for SSMs README.md Dynamic State Quantization for SSMs Overview This benchmark evaluates a dynamic precision mechanism for State Space Models (SSMs). The innovation implements a "State Quantizer" that monitors the magnitude of state deltas ($\Delta...	03-09 08:10	Success	-	View
exp_self.20260309080802.016_20260309_080831 Paper: self.20260309080802.016	Magnitude-Adaptive State Quantization (MASQ) Paper ID: self.20260309080802.016 - Hypothesis: Using a hebbian-like gating mechanism to detect 'high energy' state updates and keeping those in FP16, while quantizing 'low energy' updates to INT4, will preserve model stability. - Plan: Mod...	03-09 08:08	Success	-	View
exp_pytrain.20260309080532.009_20260309_080607 Paper: pytrain.20260309080532.009	Benchmark: Strictly Typed Dynamic Plugin Loader README.md Benchmark: Strictly Typed Dynamic Plugin Loader Overview This benchmark evaluates the ability of a Python system to construct a robust, dependency-free plugin loading mechanism. It demonstrates the synergy between Python's `typing...	03-09 08:06	Success	-	View
exp_self.20260309080313.015_20260309_080355 Paper: self.20260309080313.015	Section 1: README.md Latency-Aware State Tiering (LAST) Benchmark Overview This benchmark evaluates the Latency-Aware State Tiering (LAST) hypothesis. The core idea is that in State Space Models (SSMs) or RNNs, not all hidden states in a large batch are act...	03-09 08:04	Success	-	View
exp_self.20260309080044.014_20260309_080126 Paper: self.20260309080044.014	Associative State Injection (ASI) Layer Benchmark README.md Associative State Injection (ASI) Layer Benchmark Overview This benchmark implements and evaluates the Associative State Injection (ASI) layer innovation. ASI augments standard State Space Models (SSMs) with a cross-attention...	03-09 08:01	Success	-	View
exp_pytrain.20260309075851.008_20260309_075915 Paper: pytrain.20260309075851.008	Strict Package Metadata Validator README.md Strict Package Metadata Validator Overview This benchmark tests the implementation of a strict package metadata validator using Python's `typing` module (specifically `TypedDict`) and the `re` module for regex-based validation. Ob...	03-09 07:59	Success	-	View
exp_self.20260309074140.013_20260309_074221 Paper: self.20260309074140.013	This benchmark implements Adaptive Dimension-Wise State Quantization (ADWSQ). README.md This benchmark implements Adaptive Dimension-Wise State Quantization (ADWSQ). This experiment tests the hypothesis that high-variance dimensions in State Space Model (SSM) hidden states carry more information and thus require...	03-09 07:57	Success	-	View
exp_self.20260309073923.012_20260309_074015 Paper: self.20260309073923.012	Per-Channel Dynamic State Precision (PC-DSP) Benchmark This benchmark evaluates a novel optimization technique for State Space Models (SSMs) and RNNs, specifically targeting the memory footprint of the recurrent state cache. Hypothesis In sequence modeling, the hidden state acts as a memory. We...	03-09 07:40	Success	-	View
exp_pytrain.20260309073720.007_20260309_073800 Paper: pytrain.20260309073720.007	```markdown README.md bash python benchmark.py ```	03-09 07:38	Success	-	View
exp_self.20260309073417.011_20260309_073526 Paper: self.20260309073417.011	Hybrid CPU-GPU State Streaming (HCGS) Benchmark README.md Hybrid CPU-GPU State Streaming (HCGS) Benchmark Overview This benchmark validates the Hybrid CPU-GPU State Streaming (HCGS) hypothesis. It aims to demonstrate that by overlapping GPU computation of SSM (State Space Model) step...	03-09 07:35	Success	-	View
exp_self.20260309073159.010_20260309_073250 Paper: self.20260309073159.010	Interpolated State Buffering (ISB) Paper ID: self.20260309073159.010 - Hypothesis: SSM states change smoothly. We can compute the state every N steps, and for the intermediate steps, linearly interpolate between the last two checkpoints. This reduces memory bandwidth pressur...	03-09 07:32	Success	-	View
exp_pytrain.20260309072934.006_20260309_072959 Paper: pytrain.20260309072934.006	Benchmark: Dynamic Plugin Loader with Strict Type Verification README.md Benchmark: Dynamic Plugin Loader with Strict Type Verification Hypothesis An autonomous system can robustly manage modular code architectures by implementing a custom dynamic import system. This system enforces interface complianc...	03-09 07:30	Success	-	View
exp_self.20260309072744.009_20260309_072809 Paper: self.20260309072744.009	Recency-Biased Dynamic Precision (RBDP) Benchmark This benchmark demonstrates the Recency-Biased Dynamic Precision (RBDP) innovation. It simulates a State Space Model (SSM) processing a long sequence. The core hypothesis is that recent SSM states require high precision (FP16), while ol...	03-09 07:28	Success	-	View
exp_self.20260309072529.008_20260309_072601 Paper: self.20260309072529.008	Here is the runnable benchmark for the Tiered State Precision (TSP) innovation. README.md Tiered State Precision (TSP) Benchmark Hypothesis: The SSM hidden state is non-uniform; the first half (recent history) requires FP16, while the second half (long-term history) can be quantized to FP8 without significant degra...	03-09 07:26	Success	-	View
exp_pytrain.20260309072338.005_20260309_072358 Paper: pytrain.20260309072338.005	Dynamic Module Injection with Strict Protocol Validation README.md Dynamic Module Injection with Strict Protocol Validation This benchmark evaluates the capability of an autonomous coding system to implement a robust, modular plugin architecture using Python's standard library. The test focuses o...	03-09 07:24	Success	-	View
exp_self.20260309072154.007_20260309_072223 Paper: self.20260309072154.007	Entropy-Modulated Spectral State Pruning (EMSSP) README.md Entropy-Modulated Spectral State Pruning (EMSSP) Overview This benchmark implements the EMSSP innovation for State Space Models (SSMs). It tests the hypothesis that high-entropy tokens correspond to high-frequency components i...	03-09 07:22	Success	-	View
exp_self.20260309071924.006_20260309_072002 Paper: self.20260309071924.006	--- README.md --- Quantized Snapshot Recycling (QSR) Benchmark This repository contains a micro-benchmark designed to validate the Quantized Snapshot Recycling (QSR) hypothesis. Hypothesis SSM (State Space Model) states are deterministic. B...	03-09 07:20	Success	-	View
exp_pytrain.20260309071731.004_20260309_071806 Paper: pytrain.20260309071731.004	Strictly Typed CLI Data Processor README.md Strictly Typed CLI Data Processor This benchmark evaluates the ability to generate a robust, single-file Python CLI tool that enforces strict static typing using `typing` protocols and generics, while adhering to PEP 8 standards....	03-09 07:18	Success	-	View
exp_self.20260309071516.005_20260309_071622 Paper: self.20260309071516.005	Entropy-Gated Spectral Cache (EGSC) Benchmark README.md Entropy-Gated Spectral Cache (EGSC) Benchmark Overview This benchmark validates the Entropy-Gated Spectral Cache (EGSC) hypothesis. It posits that High-entropy states in a language model carry more information and require high...	03-09 07:16	Success	-	View
exp_self.20260309071239.004_20260309_071333 Paper: self.20260309071239.004	Hybrid-Precision Asynchronous State Offloading (HP-ASO) Benchmark README.md Hybrid-Precision Asynchronous State Offloading (HP-ASO) Benchmark Overview This benchmark evaluates HP-ASO, a memory management strategy designed to extend the context window of State Space Models (SSMs), such as Mamba. The co...	03-09 07:13	Success	-	View
exp_pytrain.20260309071029.003_20260309_071109 Paper: pytrain.20260309071029.003	--- README.md --- Coding Drill Benchmark: Typed ZipApp Package Factory Overview This benchmark evaluates an agent's ability to programmatically construct a Python package structure, enforce strict static typing using advanced standard library c...	03-09 07:11	Success	-	View
exp_hf_2603.01666_20260309_070832 Paper: hf_2603.01666	Beyond the Grid: Layout-Informed Multi-Vector Retrieval with Parsed Visual Document Representations Paper ID: hf_2603.01666 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	03-09 07:08	Success	-	View
exp_self.20260309070552.003_20260309_070635 Paper: self.20260309070552.003	Correction-Buffered State Streaming Paper ID: self.20260309070552.003 - Hypothesis: We keep the main SSM state in 4-bit (on CPU or disk). We maintain a tiny (e.g., 1%) 8-bit 'correction cache' in VRAM that stores the error between the 4-bit approx and the true state. - Plan:...	03-09 07:06	Success	-	View
exp_pytrain.20260309070422.002_20260309_070444 Paper: pytrain.20260309070422.002	Type-Safe Dynamic Plugin Loader with PEP 695 Overview This coding drill demonstrates the implementation of a Type-Safe Dynamic Plugin Loader using PEP 695 (Type Parameter Syntax) introduced in Python 3.12. The objective is to modernize generic wrapper classes—commonly found in...	03-09 07:04	Success	-	View
exp_self.20260309070151.002_20260309_070222 Paper: self.20260309070151.002	Entropy-Gated State Speculative Decoding Paper ID: self.20260309070151.002 - Hypothesis: High entropy tokens carry more information and require higher state fidelity. Low entropy tokens (tokens, stop words) can be processed with 4-bit states. This dynamic switching will reduce ave...	03-09 07:02	Success	-	View
exp_self.20260309065920.001_20260309_065956 Paper: self.20260309065920.001	Here is the runnable benchmark design for the Tiered Precision State Cache (TPSC) innovation. No summary available yet.	03-09 07:00	Success	-	View
exp_pytrain.20260309065752.001_20260309_065814 Paper: pytrain.20260309065752.001	Title: Structurally Typed Dynamic Plugin Loader README.md Title: Structurally Typed Dynamic Plugin Loader Description: This benchmark evaluates a system's ability to manage dynamic code loading and structural type validation without external dependencies. It tests the creation of...	03-09 06:58	Success	-	View
exp_pytrain.20260309064248.002_20260309_064327 Paper: pytrain.20260309064248.002	PEP 695 Generic Storage and Packaging Drill README.md PEP 695 Generic Storage and Packaging Drill Objective This benchmark validates the implementation of Python 3.12+ `PEP 695` Type Parameter Syntax. It requires the creation of a generic class `Storage[T]` and a generic function...	03-09 06:43	Success	-	View
exp_self.20260309064035.002_20260309_064116 Paper: self.20260309064035.002	```markdown bash python benchmark.py	03-09 06:41	Success	-	View
exp_self.20260309063822.001_20260309_063908 Paper: self.20260309063822.001	ARES: SSM + Cache + Dynamic Precision Benchmark README.md ARES: SSM + Cache + Dynamic Precision Benchmark This benchmark tests the hypothesis that combining State Space Models (SSM), efficient Caching, and Dynamic Precision improves memory efficiency and throughput compared t...	03-09 06:39	Success	-	View
exp_pytrain.20260309063620.001_20260309_063710 Paper: pytrain.20260309063620.001	Protocol-Based Dynamic Plugin Registry README.md Protocol-Based Dynamic Plugin Registry Overview This benchmark demonstrates a robust, structural subtyping-based plugin system using Python's `typing.Protocol`. Unlike traditional inheritance-based plugin architectures (Abstract B...	03-09 06:37	Success	-	View
exp_pytrain.20260309062914.003_20260309_062946 Paper: pytrain.20260309062914.003	Dynamic Plugin Loader with Runtime Type Validation Overview This benchmark tests the ability to construct a flexible, type-safe plugin architecture using Python's standard library. It simulates a dynamic package environment where modules are created in-memory, loaded via `importlib`, and va...	03-09 06:29	Success	-	View
exp_hf_2603.05438_20260309_062747 Paper: hf_2603.05438	Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model Paper ID: hf_2603.05438 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	03-09 06:27	Success	-	View
exp_self.20260309062433.002_20260309_062523 Paper: self.20260309062433.002	Low-Rank Associative State Injection (LASI) Paper ID: self.20260309062433.002 - Hypothesis: SSMs are theoretically limited to finite memory. By maintaining a small (low-rank) 'global context' matrix updated via linear attention (which is O(N) and fits in cache) and injecting it into...	03-09 06:25	Success	-	View
exp_pytrain.20260309062235.002_20260309_062308 Paper: pytrain.20260309062235.002	Generic Component Registry using PEP 695 This benchmark demonstrates the use of Python 3.12's PEP 695 Type Parameter Syntax to create a generic `ComponentRegistry` class. It validates that the new syntax reduces boilerplate (removing the need for explicit `Generic` inheritance...	03-09 06:23	Success	-	View
exp_self.20260309062030.001_20260309_062105 Paper: self.20260309062030.001	Entropy-Gated Dynamic Precision (EGDP) for SSMs README.md Entropy-Gated Dynamic Precision (EGDP) for SSMs Overview This benchmark evaluates the Entropy-Gated Dynamic Precision (EGDP) innovation applied to Mamba-style State Space Models (SSMs). Hypothesis Tokens with high entropy (hig...	03-09 06:21	Success	-	View
exp_hf_2603.06331_20260309_061853 Paper: hf_2603.06331	WorldCache: Benchmarking Heterogeneous Token Caching README.md WorldCache: Benchmarking Heterogeneous Token Caching This benchmark demonstrates the performance gains of WorldCache, a framework designed to accelerate diffusion-based world models. The Innovation Standard diffusion models ap...	03-09 06:18	Success	-	View
exp_pytrain.20260309061616.001_20260309_061655 Paper: pytrain.20260309061616.001	Python Skill Fallback Title: Type-Safe Dynamic Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-09 06:17	Success	-	View
exp_self.20260309025539.108_20260309_025736 Paper: self.20260309025539.108	Entropy-Gated Dynamic State Quantization (EG-DSQ) README.md Entropy-Gated Dynamic State Quantization (EG-DSQ) Overview This benchmark evaluates the Entropy-Gated Dynamic State Quantization (EG-DSQ) innovation applied to a State Space Model (SSM). The Innovation Standard SSMs (like Mamb...	03-09 03:01	Success	-	View
exp_pytrain.20260309025116.060_20260309_025205 Paper: pytrain.20260309025116.060	Robust Dynamic Plugin System using Protocols and Importlib README.md This benchmark evaluates a Python system's ability to dynamically construct a package structure, generate source code on-the-fly, and validate loaded modules against strict `typing.Protocol` interfaces. Objective To demonstrate ma...	03-09 02:52	Success	-	View
exp_self.20260309024727.107_20260309_024843 Paper: self.20260309024727.107	Gated State Quantization (GSQ) Paper ID: self.20260309024727.107 - Hypothesis: When the SSM gate is 'closed' (retaining old memory), the state is static and can be aggressively quantized (int8). When the gate is 'open' (absorbing new info), we temporarily switch to high...	03-09 02:48	Success	-	View
exp_pytrain.20260309024255.059_20260309_024356 Paper: pytrain.20260309024255.059	Benchmark: Auto-Registering Component System with Typed Configurations README.md Benchmark: Auto-Registering Component System with Typed Configurations Objective This benchmark tests your ability to design a robust, declarative plugin architecture using advanced Python metaprogramming features and static type...	03-09 02:43	Success	-	View
exp_pytrain.20260309023336.058_20260309_023534 Paper: pytrain.20260309023336.058	Python Skill Fallback Title: Dynamic Plugin Loader with Runtime Protocol Enforcement - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-09 02:35	Success	-	View
exp_self.20260309022924.106_20260309_023049 Paper: self.20260309022924.106	Spectral State Cache (SSC) Benchmark README.md Spectral State Cache (SSC) Benchmark This benchmark evaluates the Spectral State Cache innovation, which applies frequency-domain decomposition (DCT/FFT) to the recurrent states of State Space Models (SSMs). Hypothesis The rec...	03-09 02:30	Success	-	View
exp_pytrain.20260309022456.057_20260309_022547 Paper: pytrain.20260309022456.057	Benchmark: Strict Typed Plugin System with Namespace Control README.md Benchmark: Strict Typed Plugin System with Namespace Control Objective This benchmark validates the implementation of a strictly typed, extensible plugin system using Python's `typing.Protocol` and explicit namespace management vi...	03-09 02:25	Success	-	View
exp_self.20260309022045.105_20260309_022212 Paper: self.20260309022045.105	Asynchronous State Offloading (ASO) Benchmark This repository contains a minimal, runnable benchmark designed to test the Asynchronous State Offloading (ASO) hypothesis. The Hypothesis In State Space Models (SSMs) like Mamba, managing the recurrent state during long-context generat...	03-09 02:22	Success	-	View
exp_pytrain.20260309021757.056_20260309_021840 Paper: pytrain.20260309021757.056	Strictly-Typed Dynamic Plugin Loader Overview This benchmark evaluates the ability to construct a robust, dynamic plugin loading system using Python's standard library. It focuses on the combination of `importlib` for dynamic runtime loading and `typing.Protocol` for strict st...	03-09 02:18	Success	-	View
exp_self.20260309021514.104_20260309_021552 Paper: self.20260309021514.104	Student hypothesis: dynamic_precision + ssm_mamba co-design Paper ID: self.20260309021514.104 - Hypothesis: Combining dynamic_precision + ssm_mamba + memory will improve throughput or memory efficiency without breaking 8GB execution. - Plan: Create a compact comparative benchmark against a simple ba...	03-09 02:15	Success	-	View
exp_self.20260309021223.103_20260309_021308 Paper: self.20260309021223.103	Benchmark: SSM + Cache Co-design with Dynamic Precision README.md Benchmark: SSM + Cache Co-design with Dynamic Precision Hypothesis This benchmark explores the Student Hypothesis: Integrating State Space Models (SSM), efficient State Caching, and Dynamic Precision (Mixed Precision) in a co-...	03-09 02:13	Success	-	View
exp_pytrain.20260309021023.055_20260309_021047 Paper: pytrain.20260309021023.055	Benchmark: Type-Safe Plugin Registry with PEP 695 README.md Benchmark: Type-Safe Plugin Registry with PEP 695 Overview This benchmark validates the use of Python 3.12+'s PEP 695 Type Parameter Syntax to create a generic, type-safe Plugin Registry. It ensures that the new syntax reduces boi...	03-09 02:10	Success	-	View
exp_self.20260309020620.102_20260309_020743 Paper: self.20260309020620.102	Section 1: README.md bash pip install torch python benchmark.py	03-09 02:07	Success	-	View
exp_self.20260309020325.101_20260309_020424 Paper: self.20260309020325.101	Asynchronous State Recycle Cache (ASRC) Benchmark README.md Asynchronous State Recycle Cache (ASRC) Benchmark This repository contains a benchmark designed to test the Asynchronous State Recycle Cache (ASRC) innovation. The hypothesis is that by offloading SSM (State Space Model) state...	03-09 02:04	Success	-	View
exp_pytrain.20260309020023.054_20260309_020117 Paper: pytrain.20260309020023.054	```markdown README.md bash python benchmark.py ---	03-09 02:01	Success	-	View
exp_self.20260309015635.100_20260309_015742 Paper: self.20260309015635.100	Innovation: Temporal Delta State Quantization README.md Innovation: Temporal Delta State Quantization Overview This benchmark validates the Temporal Delta State Quantization technique applied to State Space Models (SSMs). Hypothesis: SSM states evolve smoothly over time. The di...	03-09 01:57	Success	-	View
exp_pytrain.20260309015345.053_20260309_015434 Paper: pytrain.20260309015345.053	Benchmark: Dynamic Backend Registry with Protocol Enforcement README.md Benchmark: Dynamic Backend Registry with Protocol Enforcement Title: Dynamic Backend Registry with Protocol Enforcement Focus: `typing.Protocol`, `importlib`, dynamic plugin discovery. Execution Time: < 20 seconds. Obj...	03-09 01:54	Success	-	View
exp_self.20260309015057.099_20260309_015157 Paper: self.20260309015057.099	Pinned-State Quantization Buffer (PSQB) Benchmark README.md Pinned-State Quantization Buffer (PSQB) Benchmark This repository contains the benchmark code for the Pinned-State Quantization Buffer (PSQB) innovation. Hypothesis For State Space Models (SSMs) like Mamba, the recurrent state...	03-09 01:52	Success	-	View
exp_self.20260309014812.098_20260309_014905 Paper: self.20260309014812.098	Spectral State Denoising (SSD) Benchmark README.md Spectral State Denoising (SSD) Benchmark This benchmark evaluates the hypothesis that recurrent hidden states in State Space Models (SSMs) contain high-frequency noise that can be discarded to improve memory efficiency. The Innova...	03-09 01:49	Success	-	View
exp_pytrain.20260309014459.052_20260309_014557 Paper: pytrain.20260309014459.052	Strictly-Typed Component Registry System Overview This benchmark demonstrates a strictly-typed `Registry` pattern implementation using Python's standard `typing` module. It mimics the behavior of modern ML frameworks (like Hugging Face Transformers or Diffusers) where components a...	03-09 01:46	Success	-	View
exp_self.20260309014142.097_20260309_014230 Paper: self.20260309014142.097	Linear-Mamba Kernel Fusion (LMKF) Benchmark README.md Linear-Mamba Kernel Fusion (LMKF) Benchmark Overview This benchmark validates the Linear-Mamba Kernel Fusion (LMKF) hypothesis: that a hybrid inference engine can switch between an optimized SSM (Mamba-style) execution path an...	03-09 01:42	Success	-	View
exp_self.20260309013907.096_20260309_013955 Paper: self.20260309013907.096	Entropy-Gated Dynamic State Quantization Benchmark README.md Entropy-Gated Dynamic State Quantization Benchmark This benchmark evaluates a novel optimization for State Space Models (SSMs) where the precision of the hidden state is dynamically adjusted based on the information entropy of the...	03-09 01:40	Success	-	View
exp_pytrain.20260309013625.051_20260309_013708 Paper: pytrain.20260309013625.051	```markdown README.md bash python benchmark.py	03-09 01:37	Success	-	View
exp_self.20260309013215.095_20260309_013322 Paper: self.20260309013215.095	Entropy-Triggered CPU Offload (ETCO) Overview This benchmark tests the Entropy-Triggered CPU Offload (ETCO) strategy applied to State Space Models (SSMs). The core hypothesis is that the internal state `h` of an SSM acts as a compressive history. During fluent generation (...	03-09 01:33	Success	-	View
exp_pytrain.20260309012927.050_20260309_013026 Paper: pytrain.20260309012927.050	Coding Drill: Asynchronous Typed Module Pattern README.md Coding Drill: Asynchronous Typed Module Pattern Objective This benchmark evaluates the ability to design and verify a robust, single-file Python module that adheres to modern packaging and typing standards. The drill requires gene...	03-09 01:30	Success	-	View
exp_self.20260309012628.094_20260309_012724 Paper: self.20260309012628.094	This benchmark evaluates the Frequency-Domain State Offloading technique for State Space Models (SSMs). README.md This benchmark evaluates the Frequency-Domain State Offloading technique for State Space Models (SSMs). Concept Standard SSM implementations maintain a recurrent state tensor on the GPU to avoid slow PCIe transfers. This limit...	03-09 01:27	Success	-	View
exp_self.20260309012325.093_20260309_012408 Paper: self.20260309012325.093	Entropy-Adaptive State Quantization (EASQ) Paper ID: self.20260309012325.093 - Hypothesis: High-entropy inputs require full FP16 state precision to maintain gradients, while low-entropy inputs can safely use INT4 states, reducing VRAM pressure by 30%. - Plan: Implement a wrapper for...	03-09 01:24	Success	-	View
exp_pytrain.20260309012015.049_20260309_012146 Paper: pytrain.20260309012015.049	Dynamic Module Validator with TypeGuards README.md Dynamic Module Validator with TypeGuards Overview This coding drill demonstrates a robust approach to runtime type safety in Python plugin systems. It simulates a scenario where an application must dynamically load a module from a...	03-09 01:21	Success	-	View
exp_self.20260309011711.092_20260309_011815 Paper: self.20260309011711.092	--- Student hypothesis: ssm + cache + dynamic_precision Hypothesis Combining `ssm` + `cache` + `dynamic_precision` will improve throughput or memory efficiency without breaking 8GB execution. Plan Create a compact comparative benchmark against...	03-09 01:18	Success	-	View
exp_self.20260309011422.091_20260309_011508 Paper: self.20260309011422.091	Sliding-Window Linear SSM Bridge Paper ID: self.20260309011422.091 - Hypothesis: SSMs fail at precise retrieval because of state compression. A sliding window attention layer (Linear Attention) applied to the raw recent tokens will boost retrieval accuracy without quadrati...	03-09 01:15	Success	-	View
exp_pytrain.20260309011242.048_20260309_011314 Paper: pytrain.20260309011242.048	```markdown README.md bash python benchmark.py	03-09 01:13	Success	-	View
exp_self.20260309011015.090_20260309_011114 Paper: self.20260309011015.090	Salience-Adaptive Mixed-Precision States (SAMP-S) Innovation This benchmark introduces Salience-Adaptive Mixed-Precision States, a compression technique for State Space Models (SSMs). Standard SSMs maintain large recurrent states (e.g., in Mamba architectures) entirely in FP16. We hypo...	03-09 01:11	Success	-	View
exp_self.20260309010812.089_20260309_010843 Paper: self.20260309010812.089	Student hypothesis: ssm + cache co-design Paper ID: self.20260309010812.089 - Hypothesis: Combining ssm + cache + dynamic_precision will improve throughput or memory efficiency without breaking 8GB execution. - Plan: Create a compact comparative benchmark against a simple baseline,...	03-09 01:08	Success	-	View
exp_pytrain.20260309010534.047_20260309_010619 Paper: pytrain.20260309010534.047	Dynamic Plugin Registry with Runtime Type Enforcement README.md Title: Dynamic Plugin Registry with Runtime Type Enforcement Overview This benchmark tests a system's ability to create a robust, dynamic module loader that utilizes Python's `importlib` to discover user-defined packages within a...	03-09 01:06	Success	-	View
exp_self.20260309010152.088_20260309_010330 Paper: self.20260309010152.088	Benchmark: Pipeline-Asynchronous State Offload (PASO) README.md Benchmark: Pipeline-Asynchronous State Offload (PASO) Overview This benchmark tests the PASO innovation, designed to handle infinite-length context sequences on limited GPU VRAM (e.g., 8GB) by offloading SSM (State Space Model...	03-09 01:03	Success	-	View
exp_pytrain.20260309005912.046_20260309_005954 Paper: pytrain.20260309005912.046	Benchmark: Strictly-Typed Plugin Registry with Metadata Introspection README.md Benchmark: Strictly-Typed Plugin Registry with Metadata Introspection Overview This benchmark tests the ability to implement a robust, type-safe plugin system using Python's standard library. The core hypothesis is that `typing.Pr...	03-09 00:59	Success	-	View
exp_self.20260309005654.087_20260309_005730 Paper: self.20260309005654.087	Paged-Scan State Memory (PSSM) Benchmark This benchmark demonstrates the Paged-Scan State Memory (PSSM) concept, an optimization designed to overcome GPU VRAM limitations when processing long-context sequences in State Space Models (SSMs) like Mamba. The Innovation: Paged-Scan...	03-09 00:57	Success	-	View
exp_self.20260309005456.086_20260309_005542 Paper: self.20260309005456.086	Here is the runnable benchmark for the Modular State Experts (MoE-State) innovation. README.md	03-09 00:55	Success	-	View
exp_pytrain.20260309005252.045_20260309_005312 Paper: pytrain.20260309005252.045	```markdown bash mypy --strict benchmark.py bash python benchmark.py ``` Expected: `VERIFIED: PASSED` along with performance metrics. Acceptance Criteria - Typing: Implements `Plugin` Protocol and `PluginRegistry` using `typing.Protocol`, `typing...	03-09 00:53	Success	-	View
exp_self.20260309005035.085_20260309_005135 Paper: self.20260309005035.085	Exponential Temporal Quantization (ETQ) Paper ID: self.20260309005035.085 - Hypothesis: Recent state information requires FP16, but historical state (older than 1k tokens) can be stored in INT4 or FP8 without performance loss, exponentially decaying precision over time. - Plan: M...	03-09 00:51	Success	-	View
exp_self.20260309004838.084_20260309_004923 Paper: self.20260309004838.084	Progressive-Precision State Quantization (PPSQ) README.md Progressive-Precision State Quantization (PPSQ) Overview PPSQ is a memory optimization technique for State Space Models (SSMs) inspired by the concept of "Dynamic Precision". The core hypothesis is that the sensitivity of the...	03-09 00:49	Success	-	View
exp_pytrain.20260309004605.044_20260309_004641 Paper: pytrain.20260309004605.044	Python Skill Fallback Title: Dynamic Typed CLI Dispatcher - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-09 00:46	Success	-	View
exp_self.20260309004425.083_20260309_004458 Paper: self.20260309004425.083	Asynchronous CPU-Projected State Paper ID: self.20260309004425.083 - Hypothesis: SSM states are low-bandwidth compared to weights. By maintaining a 'hot' state on GPU and a 'cold' history on CPU (pinned memory), we can process effectively infinite context lengths within 8G...	03-09 00:45	Success	-	View
exp_self.20260309004248.082_20260309_004318 Paper: self.20260309004248.082	Variance-Gated Dynamic Quantization for SSMs This repository contains a benchmark suite designed to validate the Variance-Gated Dynamic Quantization hypothesis. Hypothesis Channels within the State Space Model (SSM) state tensor exhibit varying temporal activity. By tracking the r...	03-09 00:43	Success	-	View
exp_self.20260309004030.081_20260309_004118 Paper: self.20260309004030.081	Tiered State Offloading for Long Context Paper ID: self.20260309004030.081 - Hypothesis: Segregating the SSM hidden state into a 'hot' GPU resident state (recent tokens) and a 'cold' CPU resident state (older tokens) will allow for longer contexts than VRAM alone permits, with acc...	03-09 00:41	Success	-	View
exp_pytrain.20260309003850.043_20260309_003927 Paper: pytrain.20260309003850.043	Python Skill Fallback Title: Dynamic Type-Safe Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-09 00:39	Success	-	View
exp_self.20260309003642.080_20260309_003714 Paper: self.20260309003642.080	Logarithmic State Space Machine (LogSSM) Benchmark README.md Logarithmic State Space Machine (LogSSM) Benchmark Overview This benchmark evaluates the LogSSM innovation, which hypothesizes that storing SSM (State Space Model) states in a Logarithmic Number System (LNS) using 8-bit intege...	03-09 00:37	Success	-	View
exp_self.20260309003418.079_20260309_003451 Paper: self.20260309003418.079	Chronos-Decayed State Precision Benchmark Section 1: README.md Section 2: benchmark.py	03-09 00:35	Success	-	View
exp_pytrain.20260309003242.042_20260309_003303 Paper: pytrain.20260309003242.042	Generic Data Packet Router with PEP 695 Syntax This benchmark validates the implementation of a Generic Data Packet Router using Python 3.12's PEP 695 Type Parameter Syntax. Overview PEP 695 introduces a new, more concise syntax for declaring generics. This drill requires implementing a...	03-09 00:33	Success	-	View
exp_self.20260309002920.078_20260309_003106 Paper: self.20260309002920.078	Dual-Resolution State Management (DRSM) Paper ID: self.20260309002920.078 - Hypothesis: Splitting the SSM recurrent state into a 'hot' path (recent tokens) and 'cold' path (history) allows for aggressive compression of the history without significant performance degradation on lo...	03-09 00:31	Success	-	View
exp_self.20260309002649.077_20260309_002735 Paper: self.20260309002649.077	Innovation Benchmark: SSM + Cache + Dynamic Precision README.md Innovation Benchmark: SSM + Cache + Dynamic Precision Hypothesis Combining SSM (State Space Models), Cache (KV optimization), and Dynamic Precision (Mixed Precision/AMP) in a co-design architecture will improve through...	03-09 00:27	Success	-	View
exp_pytrain.20260309002503.041_20260309_002528 Paper: pytrain.20260309002503.041	--- README.md Type-Safe Entry Point Resolver System Overview This benchmark demonstrates a robust, type-safe plugin loading mechanism using Python's standard library. It simulates a package manager's ability to discover, load, and validate...	03-09 00:25	Success	-	View
exp_self.20260309000728.076_20260309_000814 Paper: self.20260309000728.076	Variance-Gated KV Cache Quantization (VGBKV) README.md Variance-Gated KV Cache Quantization (VGBKV) Concept Modern LLMs are bottlenecked by the memory bandwidth required to read the growing KV Cache during inference. Standard KV caches store 16-bit (FP16/BF16) vectors for every token....	03-09 00:23	Success	-	View
exp_pytrain.20260309000515.040_20260309_000551 Paper: pytrain.20260309000515.040	Type-Safe 'Mini-Tensor' Library Benchmark README.md Type-Safe 'Mini-Tensor' Library Benchmark Objective This benchmark evaluates a Python engineering system's ability to construct a modular, type-safe numerical library using only the Python Standard Library. The system must dem...	03-09 00:05	Success	-	View
exp_self.20260309000216.075_20260309_000242 Paper: self.20260309000216.075	This benchmark investigates the hypothesis that combining State Space Models (SSM), Caching mechanisms, and Dy... README.md This benchmark investigates the hypothesis that combining State Space Models (SSM), Caching mechanisms, and Dynamic Precision** can significantly improve throughput and memory efficiency compared to standard Transformer-...	03-09 00:02	Success	-	View
exp_pytrain.20260308235900.039_20260308_235924 Paper: pytrain.20260308235900.039	Extensible Type-Safe Plugin Registry This benchmark demonstrates a robust, scalable architecture pattern often seen in production ML frameworks (like Hugging Face Transformers or Diffusers), implemented entirely with Python standard library features. Overview The system implem...	03-08 23:59	Success	-	View
exp_self.20260308235554.074_20260308_235646 Paper: self.20260308235554.074	Asynchronous Delta-State Prefetching Paper ID: self.20260308235554.074 - Hypothesis: Transferring the full state from CPU to GPU causes stalls. Transferring only the delta (updates) allows overlapping computation and data transfer (async), improving throughput for large-contex...	03-08 23:56	Success	-	View
exp_self.20260308235409.073_20260308_235439 Paper: self.20260308235409.073	Linear-SSM Bridge Compression (LSBC) Paper ID: self.20260308235409.073 - Hypothesis: SSMs struggle with 'recall' of very distant context. Passing the SSM state through a Linear Attention layer every N steps allows the model to 'attend' to its own history more efficiently than...	03-08 23:54	Success	-	View
exp_pytrain.20260308235242.038_20260308_235301 Paper: pytrain.20260308235242.038	Dynamic Type-Safe Plugin Loader README.md Dynamic Type-Safe Plugin Loader Overview This benchmark tests the ability to dynamically construct a Python package on the file system and load it using the standard import machinery. It emphasizes strict typing using `typing.Prot...	03-08 23:53	Success	-	View
exp_self.20260308235054.072_20260308_235126 Paper: self.20260308235054.072	Semantic LRU for SSM State Windows Paper ID: self.20260308235054.072 - Hypothesis: In long-context conversations, recent tokens (LRU) are often filler. Replacing the state based on semantic similarity to the current query (e.g., cosine similarity of embeddings) will yield be...	03-08 23:51	Success	-	View
exp_self.20260308234916.071_20260308_234940 Paper: self.20260308234916.071	Innovation: Entropy-Gated Host-Side State Streaming (EG-HS3) README.md Innovation: Entropy-Gated Host-Side State Streaming (EG-HS3) Hypothesis High-entropy states in Selective State Space Models (SSMs) like Mamba carry unique information that is harder to compress but worth retaining in slower CPU me...	03-08 23:49	Success	-	View
exp_self.20260308234724.070_20260308_234748 Paper: self.20260308234724.070	Host-Side Linear Memory Pool (HS-LMP) Benchmark README.md Host-Side Linear Memory Pool (HS-LMP) Benchmark Overview This benchmark evaluates the Host-Side Linear Memory Pool (HS-LMP), a technique designed to extend the effective context window of State Space Models (SSMs), such as Mam...	03-08 23:48	Success	-	View
exp_pytrain.20260308234557.037_20260308_234628 Paper: pytrain.20260308234557.037	Strictly Typed Environment Metadata Inspector README.md Strictly Typed Environment Metadata Inspector Overview This coding drill validates the hypothesis that an autonomous coding system can bridge dynamic runtime introspection (packaging metadata) with static type safety (the `typing`...	03-08 23:46	Success	-	View
exp_self.20260308234339.069_20260308_234411 Paper: self.20260308234339.069	Variance-Gated Bitwidth (VGB) Paper ID: self.20260308234339.069 - Hypothesis: Not all state dimensions are equally important at all times. Dimensions with low variance (static memory) can be stored in FP8, while high-variance dimensions (active processing) require FP16....	03-08 23:44	Success	-	View
exp_self.20260308234204.068_20260308_234237 Paper: self.20260308234204.068	Entropy-Adaptive State Tiering (EAST) Reloaded README.md Entropy-Adaptive State Tiering (EAST) Reloaded Overview This benchmark implements Entropy-Adaptive State Tiering (EAST), a memory optimization technique for State Space Models (SSMs) and Large Language Models (LLMs). The Hypot...	03-08 23:42	Success	-	View
exp_pytrain.20260308233955.036_20260308_234021 Paper: pytrain.20260308233955.036	Python Skill Fallback Title: Dynamic Module Loader with Strict Protocol Enforcement - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-08 23:40	Success	-	View
exp_self.20260308233749.067_20260308_233831 Paper: self.20260308233749.067	Semantic Bitwidth Allocation (SBA) Benchmark README.md This repository contains a runnable benchmark for Semantic Bitwidth Allocation (SBA), a novel technique designed to optimize memory bandwidth in State Space Models (SSMs) like Mamba. The Innovation Standard SSMs maintain a sta...	03-08 23:38	Success	-	View
exp_self.20260308233431.066_20260308_233504 Paper: self.20260308233431.066	Token-Triggered Precision Decay (TTPD) Benchmark README.md Token-Triggered Precision Decay (TTPD) Benchmark This repository contains a micro-benchmark designed to validate the Token-Triggered Precision Decay (TTPD) hypothesis. Hypothesis Recent tokens in a Sequence Modeling (SSM) stat...	03-08 23:35	Success	-	View
exp_pytrain.20260308233233.035_20260308_233258 Paper: pytrain.20260308233233.035	Strictly Typed Generic Dispatcher with API Isolation README.md Strictly Typed Generic Dispatcher with API Isolation Overview This coding drill verifies the implementation of a library-grade `EventBus[T]` using Python 3.12's Type Parameter Syntax (PEP 695). The goal is to demonstrate how moder...	03-08 23:33	Success	-	View
exp_self.20260308233023.065_20260308_233055 Paper: self.20260308233023.065	--- README.md --- TASP Benchmark: Token-Adaptive State Precision This benchmark evaluates the Token-Adaptive State Precision (TASP) innovation for Mamba-style State Space Models (SSMs). Hypothesis Tokens with low entropy (e.g., punctuation,...	03-08 23:30	Success	-	View
exp_self.20260308232737.064_20260308_232825 Paper: self.20260308232737.064	Here is the runnable benchmark design for the Bi-Precision State Streaming (BPSS) innovation. No summary available yet.	03-08 23:28	Success	-	View
exp_pytrain.20260308232548.034_20260308_232614 Paper: pytrain.20260308232548.034	Type-Safe Plugin Registry Benchmark README.md Type-Safe Plugin Registry Benchmark This benchmark evaluates the implementation of a modular, type-safe command registry using Python's standard library type hinting features. Overview The design leverages `typing.Protocol` and `t...	03-08 23:26	Success	-	View
exp_self.20260308232252.063_20260308_232358 Paper: self.20260308232252.063	Hybrid CPU-GPU State Streaming (H-CGS) Paper ID: self.20260308232252.063 - Hypothesis: Decoupling the state update (fast, GPU) from the state storage (large, CPU) allows processing sequences 4x longer than GPU VRAM would normally allow with negligible latency penalty. - Plan: 1....	03-08 23:24	Success	-	View
exp_2603.06577v1_20260308_232106 Paper: 2603.06577v1	Section 1: README.md bash python benchmark.py	03-08 23:21	Success	-	View
exp_pytrain.20260308231824.033_20260308_231900 Paper: pytrain.20260308231824.033	Type-Driven Plugin System Drill README.md Type-Driven Plugin System Drill Overview This benchmark tests your ability to design a robust, type-safe Python library architecture using `typing.Protocol` and `typing.Generic`. The goal is to create a "Task Executor" system wher...	03-08 23:19	Success	-	View
exp_self.20260308231603.062_20260308_231640 Paper: self.20260308231603.062	Here is the design for the Pinned-State Swap Scheduler (PSSS) benchmark. Benchmark Design Overview This benchmark tests the Pinned-State Swap Scheduler (PSSS) hypothesis. It simulates a workload consisting of alternating SSM layers (which rely on a large hidden state) and MLP layers (which are comput...	03-08 23:16	Success	-	View
exp_self.20260308231322.061_20260308_231405 Paper: self.20260308231322.061	Delta-Indexed Semantic Cache (DISC) Paper ID: self.20260308231322.061 - Hypothesis: Using the derivative of the SSM state as a query key into a compressed KV-cache will allow retrieval of relevant distant context with O(1) complexity, improving perplexity on long-context task...	03-08 23:14	Success	-	View
exp_pytrain.20260308231143.032_20260308_231213 Paper: pytrain.20260308231143.032	Python Skill Fallback Title: Strict Type-Safe Plugin Registry - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-08 23:12	Success	-	View
exp_self.20260308230918.060_20260308_231004 Paper: self.20260308230918.060	Associative State Patching (ASP) Benchmark README.md Associative State Patching (ASP) Benchmark This benchmark evaluates the Associative State Patching (ASP) technique applied to State Space Models (SSMs). Hypothesis SSMs are prone to 'state drift' over long sequences. ASP maint...	03-08 23:10	Success	-	View
exp_self.20260308230715.059_20260308_230748 Paper: self.20260308230715.059	Gated Linear-Attention State Bridge (GLA-Bridge) Paper ID: self.20260308230715.059 - Hypothesis: SSMs struggle with exact recall. Gating the SSM state with a Linear Attention summary of the input history will allow the model to 'lookup' past tokens explicitly without an $O(N^2)$ cost. - P...	03-08 23:07	Success	-	View
exp_pytrain.20260308230520.031_20260308_230555 Paper: pytrain.20260308230520.031	Robust Generic Command Bus Implementation README.md Robust Generic Command Bus Implementation This benchmark implements a production-ready Command Bus pattern using only the Python Standard Library. Architecture The design enforces strict decoupling between the Request (Com...	03-08 23:06	Success	-	View
exp_self.20260308230315.058_20260308_230339 Paper: self.20260308230315.058	```markdown README.md bash python benchmark.py	03-08 23:03	Success	-	View
exp_self.20260308230054.057_20260308_230129 Paper: self.20260308230054.057	CPU-Pinned Sparse Associative Memory (CPSAM) Paper ID: self.20260308230054.057 - Hypothesis: The hidden state $H_t$ can be sparsified and stored in pinned CPU memory. A lightweight 'gate' on the GPU determines if the CPU state is needed, preventing full-GPU history storage. - Plan: Im...	03-08 23:01	Success	-	View
exp_pytrain.20260308225819.030_20260308_225849 Paper: pytrain.20260308225819.030	Generic Asynchronous Event Dispatcher Benchmark README.md Generic Asynchronous Event Dispatcher Benchmark Overview This benchmark validates the design of a strictly typed, generic asynchronous event dispatcher using Python's standard library. It demonstrates the creation of a robust, tes...	03-08 22:59	Success	-	View
exp_self.20260308225613.056_20260308_225643 Paper: self.20260308225613.056	Sparse Associative State Injection Paper ID: self.20260308225613.056 - Hypothesis: Instead of a monolithic state vector, we maintain a sparse set of 'memory slots' updated by the SSM. During generation, we perform a sparse lookup (KNN) on these slots to inject relevant histo...	03-08 22:56	Success	-	View
exp_self.20260308225403.055_20260308_225449 Paper: self.20260308225403.055	Semantic State Delta Caching (SSDC) README.md Semantic State Delta Caching (SSDC) Innovation Semantic State Delta Caching (SSDC) improves the inference speed of State Space Models (SSMs) by caching internal state vectors based on input token hashes. Concept Traditional KV cac...	03-08 22:54	Success	-	View
exp_pytrain.20260308225136.029_20260308_225212 Paper: pytrain.20260308225136.029	Generic Type-Safe Event Dispatcher Benchmark README.md Generic Type-Safe Event Dispatcher Benchmark Design Brief This benchmark demonstrates a modular, single-file Python package implementation that leverages advanced static typing features. It simulates a package structure using clas...	03-08 22:52	Success	-	View
exp_2603.06576v1_20260308_225001 Paper: 2603.06576v1	Section 1: README.md bash pip install torch python benchmark.py	03-08 22:50	Success	-	View
exp_self.20260308224720.054_20260308_224804 Paper: self.20260308224720.054	Hybrid KV-SSM Cache Injection Overview This benchmark evaluates the Hybrid KV-SSM Cache Injection architecture. This innovation combines the long-range comprehension of State Space Models (SSMs) with the precise, factual recall of a sliding-window KV cache. The Inno...	03-08 22:48	Success	-	View
exp_pytrain.20260308224510.028_20260308_224538 Paper: pytrain.20260308224510.028	Robust Type-Safe Plugin Loader README.md Robust Type-Safe Plugin Loader Overview This benchmark evaluates a developer's ability to construct a secure, extensible plugin architecture in Python using only the standard library. The task involves creating a `PluginManager` c...	03-08 22:45	Success	-	View
exp_self.20260308224304.053_20260308_224342 Paper: self.20260308224304.053	Delta-State Accumulator with CPU Offload README.md Delta-State Accumulator with CPU Offload Innovation Overview This benchmark evaluates a "Delta-State Accumulator" technique for Selective State Space Models (SSMs), specifically optimizing for GPU memory constraints. **Hypothesis:...	03-08 22:43	Success	-	View
exp_self.20260308224109.052_20260308_224138 Paper: self.20260308224109.052	Heterogeneous State Tiering (HST) Benchmark README.md Heterogeneous State Tiering (HST) Benchmark This repository contains a runnable benchmark for the Heterogeneous State Tiering (HST) proposal. Concept HST proposes an OS Paging-inspired approach to Sequence Model (SSM) memory m...	03-08 22:41	Success	-	View
exp_pytrain.20260308223912.027_20260308_223936 Paper: pytrain.20260308223912.027	Benchmark: Strictly Typed Dynamic Plugin Registry README.md Benchmark: Strictly Typed Dynamic Plugin Registry This benchmark tests the ability to construct a robust, zero-dependency extension framework using Python's standard library. It simulates a "model packaging system" often found in...	03-08 22:39	Success	-	View
exp_hf_2603.05888_20260308_223738 Paper: hf_2603.05888	PixARMesh Benchmark README.md PixARMesh Benchmark This benchmark evaluates the `PixARMesh` architecture for autoregressive 3D scene reconstruction. It specifically highlights the efficiency of using State Space Models (SSM/Mamba) for processing long sequen...	03-08 22:37	Success	-	View
exp_self.20260308223444.051_20260308_223517 Paper: self.20260308223444.051	Entropy-Adaptive Precision State Machine Benchmark This repository contains the implementation and benchmarking code for the Entropy-Adaptive Precision State Machine. Overview Traditional State Space Models (SSMs) and sequence models maintain state in full precision (FP32) regardless of...	03-08 22:35	Success	-	View
exp_pytrain.20260308223301.026_20260308_223319 Paper: pytrain.20260308223301.026	--- README.md Typed Component Registry and Dynamic Loader This benchmark demonstrates the implementation of a robust, type-safe plugin registry system using Python's standard library `typing` module. It mimics the extensibility patterns found i...	03-08 22:33	Success	-	View
exp_self.20260308223042.050_20260308_223133 Paper: self.20260308223042.050	Delta State Quantization (DSQ) for Streaming Paper ID: self.20260308223042.050 - Hypothesis: State changes ($h_t - h_{t-1}$) are sparser and lower magnitude than the state $h_t$. Storing the delta in 4-bit INT and the base state in 16-bit FP reduces memory bandwidth for state updates....	03-08 22:31	Success	-	View
exp_self.20260308222831.049_20260308_222913 Paper: self.20260308222831.049	CPU-Pinned Historical State Buffer (CHSB) README.md CPU-Pinned Historical State Buffer (CHSB) Innovation Summary Standard State Space Models (SSMs) like Mamba require maintaining a hidden state tensor that grows with sequence length. On GPU-constrained hardware (e.g., 8GB VRAM), th...	03-08 22:29	Success	-	View
exp_pytrain.20260308222619.025_20260308_222652 Paper: pytrain.20260308222619.025	```markdown README.md	03-08 22:26	Success	-	View
exp_self.20260308222344.048_20260308_222413 Paper: self.20260308222344.048	Linear Attention Hybrid IO-Layer Benchmark README.md Linear Attention Hybrid IO-Layer Benchmark This benchmark evaluates the Hybrid IO-Layer, a novel architecture combining the efficiency of State Space Models (SSMs) for long-term history with the precision of Linear Attention f...	03-08 22:24	Success	-	View
exp_self.20260308222134.047_20260308_222201 Paper: self.20260308222134.047	SSM + Cache + Dynamic Precision Benchmark README.md SSM + Cache + Dynamic Precision Benchmark This benchmark investigates the hypothesis that combining State Space Models (SSM), efficient caching mechanisms, and dynamic precision (Automatic Mixed Precision) can yield better memory...	03-08 22:22	Success	-	View
exp_pytrain.20260308221857.024_20260308_221942 Paper: pytrain.20260308221857.024	Strictly-Typed Plugin Pipeline Benchmark README.md Strictly-Typed Plugin Pipeline Benchmark Overview This coding drill validates the implementation of a robust, strictly-typed data processing pipeline using Python's standard `typing` module. It demonstrates the use of `Protocol` f...	03-08 22:19	Success	-	View
exp_self.20260308221650.046_20260308_221719 Paper: self.20260308221650.046	Sparse State History Retrieval (SSHR) Benchmark This benchmark tests the hypothesis that offloading state history to a CPU-side KNN index (FAISS) and injecting the nearest neighbor into the current SSM step improves long-term retention without increasing the recurrent state size. Hypothe...	03-08 22:17	Success	-	View
exp_self.20260308221355.045_20260308_221441 Paper: self.20260308221355.045	--- README.md bash python benchmark.py	03-08 22:14	Success	-	View
exp_pytrain.20260308221201.023_20260308_221223 Paper: pytrain.20260308221201.023	Python Skill Fallback Title: Type-Safe Asynchronous Entry Point Dispatcher - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-08 22:12	Success	-	View
exp_gh_obss_sahi_20260308_221028 Paper: gh_obss_sahi	obss/sahi Paper ID: gh_obss_sahi - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark s...	03-08 22:10	Success	-	View
exp_self.20260308220740.044_20260308_220836 Paper: self.20260308220740.044	Entropy-Adaptive State Tiering (EAST) Benchmark README.md Entropy-Adaptive State Tiering (EAST) Benchmark This benchmark validates the EAST hypothesis: Low-entropy (stable/boring) states in State Space Models (SSMs) can be offloaded to CPU pinned memory without significantly degradin...	03-08 22:08	Success	-	View
exp_pytrain.20260308220502.022_20260308_220549 Paper: pytrain.20260308220502.022	PEP 695 Type Parameter Syntax & Module Hygiene Overview This benchmark evaluates a developer's implementation of Python 3.12's PEP 695 Type Parameter Syntax and module hygiene standards. It verifies that the provided module uses the new generic syntax (e.g., `class MyClass[T]:`, `def fu...	03-08 22:05	Success	-	View
exp_self.20260308220301.043_20260308_220336 Paper: self.20260308220301.043	Benchmark: SSM + Cache Co-Design with Dynamic Precision README.md Benchmark: SSM + Cache Co-Design with Dynamic Precision Hypothesis Combining State Space Models (SSM), explicit State Caching, and Dynamic Precision (AMP) will yield higher throughput and lower VRAM usage compared to a standard Tr...	03-08 22:03	Success	-	View
exp_hf_2603.06569_20260308_220047 Paper: hf_2603.06569	```markdown bash python benchmark.py	03-08 22:00	Success	-	View
exp_pytrain.20260308215722.021_20260308_215759 Paper: pytrain.20260308215722.021	Dynamic Plugin Architecture with Strict Typing README.md Dynamic Plugin Architecture with Strict Typing This benchmark tests the ability to implement a robust, dynamic plugin loading system using Python's standard library. It focuses on simulating a packaging workflow where package stru...	03-08 21:58	Success	-	View
exp_self.20260308215449.042_20260308_215605 Paper: self.20260308215449.042	CPU-Pinned Sparse State Recycling README.md CPU-Pinned Sparse State Recycling This benchmark implements and evaluates a memory-efficient State Space Model (SSM) inference technique designed to extend context windows beyond standard GPU VRAM limitations. Concept Standard SSM...	03-08 21:56	Success	-	View
exp_self.20260308215147.041_20260308_215234 Paper: self.20260308215147.041	Entropy-Adaptive KV Cache Quantization Paper ID: self.20260308215147.041 - Hypothesis: Tokens with low entropy (predictable) can be stored in 4-bit without loss, while high-entropy tokens require 8-bit. This adaptive method preserves coherence where it matters most. - Plan: Hook...	03-08 21:52	Success	-	View
exp_pytrain.20260308214956.020_20260308_215027 Paper: pytrain.20260308214956.020	Strictly-Typed Dynamic Package Loader README.md Strictly-Typed Dynamic Package Loader Overview This coding drill tests your ability to dynamically generate Python packages, enforce strict static typing using Generics (`typing.Generic`), and validate package structure programmat...	03-08 21:50	Success	-	View
exp_self.20260308214806.040_20260308_214831 Paper: self.20260308214806.040	```markdown bash python benchmark.py	03-08 21:48	Success	-	View
exp_self.20260308214513.039_20260308_214539 Paper: self.20260308214513.039	Asynchronous CPU-Pinned State Ringbuffer for SSMs README.md Asynchronous CPU-Pinned State Ringbuffer for SSMs This benchmark demonstrates a novel memory management technique for State-Space Models (SSMs), specifically targeting Mamba-style architectures. By exploiting the natural decay of...	03-08 21:46	Success	-	View
exp_pytrain.20260308214254.019_20260308_214321 Paper: pytrain.20260308214254.019	Design one runnable Python coding drill benchmark. STRICT REQUIREMENT: Output two sections separated by '	03-08 21:43	Success	-	View
exp_self.20260308213951.038_20260308_214037 Paper: self.20260308213951.038	Section 1: README.md Section 2: benchmark.py README.md content: - Title, Hypothesis, Setup, Usage. benchmark.py content: - Import torch, time, gc. - Define constants. - Class `DRSPCache` implementing the tiered logic. - Class `StandardCache` for baseline. - `ru...	03-08 21:41	Success	-	View
exp_self.20260308213821.037_20260308_213847 Paper: self.20260308213821.037	Section 1: README.md Hybrid Attention-SSM Corrector (HASC) Benchmark Innovation The Hybrid Attention-SSM Corrector (HASC) enhances standard Selective State Space Models (SSMs) like Mamba by injecting a local attention vector into the state update mechanism....	03-08 21:38	Success	-	View
exp_pytrain.20260308213637.018_20260308_213701 Paper: pytrain.20260308213637.018	Title: Strict Data Processor Module Design README.md Title: Strict Data Processor Module Design Description: This benchmark evaluates the creation of a robust, reusable generic pipeline system using Python's standard typing utilities. The candidate must implement a `Pipeline...	03-08 21:37	Success	-	View
exp_self.20260308213236.036_20260308_213311 Paper: self.20260308213236.036	Gradient-Modulated State Quantization (GMSQ) README.md Gradient-Modulated State Quantization (GMSQ) Innovation: Dynamic Precision + SSM Hypothesis: Timesteps with high gradient magnitude require higher precision state retention, while 'flat' regions can survive 4-bit or 2-bit...	03-08 21:35	Success	-	View
exp_self.20260308213037.035_20260308_213115 Paper: self.20260308213037.035	Here is the design for the Semantic Partitioned State Space (SPSS) benchmark. Section 1 contains the documentation. Section 2 contains the runnable Python benchmark. bash python benchmark.py ```	03-08 21:31	Success	-	View
exp_pytrain.20260308212853.017_20260308_212910 Paper: pytrain.20260308212853.017	```markdown README.md bash python benchmark.py Generating temporary package structure... Loading module from tmp_pkg/processor.py... Validating against StrictValidator protocol... VRAM_USAGE: 0.00MB TOKENS_PER_SEC: <calculated_value> VERIFIED: PASSED	03-08 21:29	Success	-	View
exp_self.20260308212625.034_20260308_212702 Paper: self.20260308212625.034	Spectral State Compression (SSC) Benchmark This benchmark evaluates the hypothesis that SSM hidden states can be compressed in the frequency domain (using FFT) to save memory with minimal degradation in model performance (perplexity). README.md bash python benchmark.py	03-08 21:27	Success	-	View
exp_self.20260308212442.033_20260308_212510 Paper: self.20260308212442.033	Speculative State Offloading (SSO) Benchmark README.md Speculative State Offloading (SSO) Benchmark This benchmark validates the Speculative State Offloading (SSO) hypothesis, which posits that state evolution in State Space Models (SSMs) is sufficiently smooth to be approximated...	03-08 21:25	Success	-	View
exp_pytrain.20260308212245.016_20260308_212309 Paper: pytrain.20260308212245.016	Typed Dependency Graph Resolver README.md Typed Dependency Graph Resolver This benchmark evaluates the implementation of a robust `DependencyResolver` using Python's modern typing features. Objective Implement a dependency resolution algorithm that calculates a valid inst...	03-08 21:23	Success	-	View
exp_self.20260308211846.032_20260308_211946 Paper: self.20260308211846.032	Entropy-Gated Token-Wise State Precision README.md Entropy-Gated Token-Wise State Precision Overview This benchmark evaluates an optimization technique for State Space Models (SSMs) and Recurrent Architectures. It tests the hypothesis that not all tokens require full-precision (FP...	03-08 21:19	Success	-	View
exp_pytrain.20260308211623.015_20260308_211656 Paper: pytrain.20260308211623.015	Dynamic Package Loading with Structural Typing Validation Overview This benchmark tests the ability to construct a robust Python plugin system. It demonstrates dynamic module discovery, loading from an arbitrary file system location, and structural interface validation using Python's `typing.Proto...	03-08 21:17	Success	-	View
exp_hf_2603.06351_20260308_211436 Paper: hf_2603.06351	Dynamic Chunking Diffusion Transformer Paper ID: hf_2603.06351 - Hypothesis: Benchmark a simplified recovered baseline against an ablated variant. - Plan: Run the deterministic recovery benchmark and capture VRAM plus throughput telemetry. - Expected Signal: Recovered benchmark...	03-08 21:14	Success	-	View
exp_self.20260308211115.031_20260308_211210 Paper: self.20260308211115.031	Innovation: Log-State Numerical Stability (LSNS) README.md Innovation: Log-State Numerical Stability (LSNS) Overview This benchmark investigates the hypothesis that performing State Space Model (SSM) state updates in the logarithmic domain improves numerical fidelity on long sequences com...	03-08 21:12	Success	-	View
exp_pytrain.20260308210854.014_20260308_210939 Paper: pytrain.20260308210854.014	Drill: Strictly Typed Configuration Module with CLI Interface Adhering to strict `typing` protocols (TypedDict, Protocol) and packaging standards (versioning, `__all__`, entry-point simulation) within a single script significantly reduces runtime errors and improves the maintainability of configuratio...	03-08 21:09	Success	-	View
exp_self.20260308210647.030_20260308_210731 Paper: self.20260308210647.030	VRAM-Responsive State Eviction (VRSE) Benchmark README.md VRAM-Responsive State Eviction (VRSE) Benchmark This repository contains a benchmark designed to test the VRSE innovation. Hypothesis Applying a cache policy (e.g., LRU) to the batch state dimension of State Space Models (SS...	03-08 21:07	Success	-	View
exp_self.20260308210434.029_20260308_210514 Paper: self.20260308210434.029	```markdown README.md	03-08 21:05	Success	-	View
exp_pytrain.20260308210227.013_20260308_210302 Paper: pytrain.20260308210227.013	Python Skill Fallback Title: Strict Entry Point Dispatcher - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-08 21:03	Success	-	View
exp_self.20260308205807.028_20260308_205932 Paper: self.20260308205807.028	Temporal-Decay State Precision (TDSP) Benchmark README.md Temporal-Decay State Precision (TDSP) Benchmark This benchmark evaluates the TDSP innovation, which hypothesizes that recent token history requires BF16 precision for gradient stability, while older history (state) can be main...	03-08 20:59	Success	-	View
exp_pytrain.20260308205537.012_20260308_205616 Paper: pytrain.20260308205537.012	Dynamic Module Packaging and Runtime Type Verification Overview This benchmark tests the ability to construct Python packaging tooling from scratch using only the standard library. It validates a system's capability to perform file system operations, dynamic code generation, runtime module impo...	03-08 20:56	Success	-	View
exp_self.20260308205343.027_20260308_205410 Paper: self.20260308205343.027	Contiguous-Buffer State Offload (CBSO) Benchmark README.md Contiguous-Buffer State Offload (CBSO) Benchmark This benchmark evaluates the CBSO innovation, designed to mitigate device synchronization crashes and optimize VRAM usage in State Space Models (SSMs) like Mamba. The Innovation...	03-08 20:54	Success	-	View
exp_hf_2603.06199_20260308_205152 Paper: hf_2603.06199	FlashPrefill Benchmark Overview This benchmark evaluates the performance characteristics of FlashPrefill, a framework designed for ultra-fast long-context prefilling. It compares the proposed method against a standard Dense Attention baseline. **Key Innovatio...	03-08 20:52	Success	-	View
exp_pytrain.20260308204912.011_20260308_204941 Paper: pytrain.20260308204912.011	Python Skill Fallback Title: Structural Subtyping Plugin System - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-08 20:49	Success	-	View
exp_self.20260308204701.026_20260308_204739 Paper: self.20260308204701.026	Project: ARES Benchmark Prototype (SSM + Cache + Dynamic Precision) README.md Project: ARES Benchmark Prototype (SSM + Cache + Dynamic Precision) Description: This benchmark investigates the hypothesis that a co-design of State Space Models (SSM), State Caching, and Dynamic Precision can yield super...	03-08 20:47	Success	-	View
exp_self.20260308204439.025_20260308_204520 Paper: self.20260308204439.025	Variance-Based Dynamic State Precision Benchmark This benchmark evaluates a novel optimization for State Space Models (SSMs), specifically targeting the Mamba architecture. The core hypothesis is that the hidden state within the SSM recurrence does not require uniform FP16 precision. By c...	03-08 20:45	Success	-	View
exp_pytrain.20260308204222.010_20260308_204311 Paper: pytrain.20260308204222.010	Type-Safe Async Worker Simulation Benchmark README.md Type-Safe Async Worker Simulation Benchmark Objective This benchmark evaluates the ability to construct a production-ready Python module that adheres to strict software engineering standards. The goal is to create `async_worker.py...	03-08 20:43	Success	-	View
exp_self.20260308202337.024_20260308_202444 Paper: self.20260308202337.024	--- README.md Benchmark: Entropy-Thresholded Dynamic State Quantization (Mamba) This benchmark implements and tests an innovation applied to State Space Models (SSMs), specifically targeting the Mamba architecture. Hypothesis The hidden sta...	03-08 20:39	Success	-	View
exp_self.20260308202042.023_20260308_202133 Paper: self.20260308202042.023	Here is the design for the SSM-Guided KV Cache Eviction benchmark. No summary available yet.	03-08 20:21	Success	-	View
exp_pytrain.20260308201908.009_20260308_201925 Paper: pytrain.20260308201908.009	Benchmark: Runtime-Checked Plugin Discovery System README.md Benchmark: Runtime-Checked Plugin Discovery System Hypothesis An autonomous system can robustly implement a modular architecture by leveraging Python's `importlib` for dynamic code loading and `typing.Protocol` for structural subt...	03-08 20:19	Success	-	View
exp_self.20260308201708.022_20260308_201749 Paper: self.20260308201708.022	Benchmark: Linear-Attention State Priming (LASP) README.md Benchmark: Linear-Attention State Priming (LASP) Hypothesis Standard State Space Models (SSMs) like Mamba theoretically handle infinite context, but in practice, the recurrent hidden state $h_t$ acts as a lossy bottleneck. Informa...	03-08 20:18	Success	-	View
exp_self.20260308201444.021_20260308_201539 Paper: self.20260308201444.021	Mixed-Precision State Segments Benchmark This benchmark evaluates the "Mixed-Precision State Segments" hypothesis, specifically applied to Mamba-style State Space Models (SSMs). It aims to demonstrate that by profiling state gradients to identify sensitive dimensions, we can store...	03-08 20:15	Success	-	View
exp_pytrain.20260308201205.008_20260308_201326 Paper: pytrain.20260308201205.008	Python Skill Fallback Title: PEP 440 Semantic Version Resolver & Validator - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-08 20:13	Success	-	View
exp_self.20260308200958.020_20260308_201025 Paper: self.20260308200958.020	Hybrid Mamba-Linear Router (HMLR) Paper ID: self.20260308200958.020 - Hypothesis: High-entropy tokens require the recall of Linear Attention, while low-entropy tokens are efficiently handled by SSM recurrence. A per-token router will lower VRAM usage (via SSM) while maintai...	03-08 20:10	Success	-	View
exp_self.20260308200824.019_20260308_200855 Paper: self.20260308200824.019	This benchmark evaluates the LoRA-Dynamic State Expansion technique for efficient sequence modeling. README.md This benchmark evaluates the LoRA-Dynamic State Expansion technique for efficient sequence modeling. Concept Standard State Space Models (SSMs) like Mamba maintain a large hidden state to handle long-range dependencies, leadin...	03-08 20:09	Success	-	View
exp_self.20260308200635.018_20260308_200700 Paper: self.20260308200635.018	Entropy-Gated Sparse State Paper ID: self.20260308200635.018 - Hypothesis: Not every token requires a full state update. For low-entropy tokens (stopwords, punctuation), we can skip updating 50% of the state dimensions (Top-K update) without degrading coherence. - Pl...	03-08 20:07	Success	-	View
exp_pytrain.20260308200505.007_20260308_200531 Paper: pytrain.20260308200505.007	Typed Modular Plugin Registry README.md Typed Modular Plugin Registry This benchmark evaluates the design and performance of a robust, type-safe component registry using Python's `typing` module. It simulates a micro-kernel architecture where a central registry manages...	03-08 20:05	Success	-	View
exp_self.20260308200236.017_20260308_200309 Paper: self.20260308200236.017	Entropy-Adaptive State Quantization (EASQ) Benchmark README.md Entropy-Adaptive State Quantization (EASQ) Benchmark This repository contains a minimal, runnable benchmark for the Entropy-Adaptive State Quantization (EASQ) innovation. Hypothesis Tokens with low information entropy (e.g., p...	03-08 20:03	Success	-	View
exp_self.20260308200055.016_20260308_200128 Paper: self.20260308200055.016	Student hypothesis: ssm + cache co-design Paper ID: self.20260308200055.016 - Hypothesis: Combining ssm + cache + dynamic_precision will improve throughput or memory efficiency without breaking 8GB execution. - Plan: Create a compact comparative benchmark against a simple baseline,...	03-08 20:01	Success	-	View
exp_pytrain.20260308195832.006_20260308_195854 Paper: pytrain.20260308195832.006	Generic Plugin Registry with Protocol Constraints Drill Overview This benchmark evaluates your ability to design robust, type-safe polymorphic architectures using Python's advanced type system (`typing.Protocol`, `typing.TypeVar`, `typing.Generic`), mirroring patterns found in high-perform...	03-08 19:59	Success	-	View
exp_self.20260308195640.015_20260308_195718 Paper: self.20260308195640.015	Local Attention-SSM Error Correction Loop README.md Local Attention-SSM Error Correction Loop Innovation This benchmark implements a Local Attention-SSM Error Correction Loop, a hybrid architecture combining State Space Models (SSMs) with local sliding-window attention. Hypothe...	03-08 19:57	Success	-	View
exp_self.20260308195529.014_20260308_195547 Paper: self.20260308195529.014	Sink-Token State Initialization Benchmark README.md Sink-Token State Initialization Benchmark This benchmark evaluates the Sink-Token State Initialization technique for State Space Models (SSMs). The Innovation Standard SSMs (like Mamba) initialize their recurrent state $h_0$ t...	03-08 19:55	Success	-	View
exp_self.20260308195353.013_20260308_195421 Paper: self.20260308195353.013	Student hypothesis: ssm + cache co-design Paper ID: self.20260308195353.013 - Hypothesis: Combining ssm + cache + dynamic_precision will improve throughput or memory efficiency without breaking 8GB execution. - Plan: Create a compact comparative benchmark against a simple baseline,...	03-08 19:54	Success	-	View
exp_pytrain.20260308195213.005_20260308_195234 Paper: pytrain.20260308195213.005	Strictly Typed Dynamic Plugin Registry Overview This benchmark is a self-contained Python script designed to test a developer's ability to implement a robust, strictly-typed plugin architecture using Python's standard library `typing` module. It simulates a micro-packaging envir...	03-08 19:52	Success	-	View
exp_self.20260308195013.012_20260308_195102 Paper: self.20260308195013.012	Gradient-Checkpointing State Streaming Benchmark README.md Gradient-Checkpointing State Streaming Benchmark This benchmark validates the Keyframe Caching innovation, which applies gradient-checkpointing principles to State Space Model (SSM) inference. The Problem Standard SSM inferenc...	03-08 19:51	Success	-	View
exp_self.20260308194845.011_20260308_194916 Paper: self.20260308194845.011	Recency-Stratified State Precision (RSSP) Paper ID: self.20260308194845.011 - Hypothesis: SSM state vectors suffer primarily from quantization error in the immediate recurrence window; older history can be aggressively quantized to 4-bit or binary with minimal performance loss. - P...	03-08 19:49	Success	-	View
exp_self.20260308194641.010_20260308_194718 Paper: self.20260308194641.010	SSM + Cache + Dynamic Precision Co-design Benchmark README.md SSM + Cache + Dynamic Precision Co-design Benchmark Hypothesis Combining State Space Models (SSMs), State Caching, and Dynamic Precision (Mixed Precision) will significantly improve inference throughput and memory efficiency compa...	03-08 19:47	Success	-	View
exp_pytrain.20260308194525.004_20260308_194547 Paper: pytrain.20260308194525.004	Python Skill Fallback Title: Dynamic Type-Safe Plugin Loader - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-08 19:45	Success	-	View
exp_self.20260308194354.009_20260308_194421 Paper: self.20260308194354.009	README.md No summary available yet.	03-08 19:44	Success	-	View
exp_self.20260308194207.008_20260308_194230 Paper: self.20260308194207.008	Dynamic LoRA Injection for State Decay README.md Dynamic LoRA Injection for State Decay Hypothesis In State Space Models (SSMs) like Mamba, the `dt` (delta time-step) parameter acts as a gate, controlling the balance between long-term history (global context) and immediate input...	03-08 19:42	Success	-	View
exp_self.20260308194041.007_20260308_194103 Paper: self.20260308194041.007	Benchmark: Time-Decay Weighted State Cache for SSMs README.md Benchmark: Time-Decay Weighted State Cache for SSMs Overview This benchmark evaluates the Time-Decay Weighted State Cache innovation. The hypothesis is that standard State Space Models (SSMs) suffer from unbounded state growth...	03-08 19:41	Success	-	View
exp_pytrain.20260308193907.003_20260308_193926 Paper: pytrain.20260308193907.003	Python Skill Fallback Title: Asyncio-Driven Service Registry with Protocol Enforcement - Focus: typing, packaging - Note: Generated fallback due to unavailable model output.	03-08 19:39	Success	-	View
exp_self.20260308193714.006_20260308_193749 Paper: self.20260308193714.006	Hybrid Linear-SSM State Fusion Benchmark README.md Hybrid Linear-SSM State Fusion Benchmark This repository contains the implementation and benchmarking code for the Hybrid Linear-SSM State Fusion architecture. Concept The standard implementation of Linear Attention layers req...	03-08 19:37	Success	-	View
exp_self.20260308193524.005_20260308_193555 Paper: self.20260308193524.005	Section 1: README.md bash pip install torch numpy scipy bash python benchmark.py [Baseline] VRAM_USAGE: 1200MB TOKENS_PER_SEC: 85.5 [DSRA] VRAM_USAGE: 950MB TOKENS_PER_SEC: 90.2 RESULT: DSRA reduces VRAM by X% and improves TPS by Y%.	03-08 19:36	Success	-	View
exp_self.20260308193327.004_20260308_193356 Paper: self.20260308193327.004	Here is the design for the "Student hypothesis: ssm + cache + dynamic_precision" benchmark. README.md Benchmark: SSM + Cache + Dynamic Precision Co-design Hypothesis Combining SSM (State Space Models), Cache (state persistence), and Dynamic Precision (BF16/AMP) will significantly improve memory efficiency (VRAM) and th...	03-08 19:34	Success	-	View
exp_pytrain.20260308193143.002_20260308_193207 Paper: pytrain.20260308193143.002	Dynamic Package Construction with PEP 695 Generics This benchmark evaluates a system's ability to programmatically generate a valid Python package structure and utilize modern typing features introduced in Python 3.12 (PEP 695). Overview The script attempts to: 1. Create a temporary file sy...	03-08 19:32	Success	-	View
exp_self.20260308192941.003_20260308_193013 Paper: self.20260308192941.003	Channel-Wise Adaptive State Quantization (WASQ) README.md Channel-Wise Adaptive State Quantization (WASQ) Overview This benchmark implements the Channel-Wise Adaptive State Quantization (WASQ) innovation for State Space Models (SSMs). It tests the hypothesis that allocating heterogen...	03-08 19:30	Success	-	View
exp_self.20260308192753.002_20260308_192831 Paper: self.20260308192753.002	Low-Rank State Projection (LoRSP) Benchmark README.md Low-Rank State Projection (LoRSP) Benchmark Innovation Description Low-Rank State Projection (LoRSP) is a technique designed to optimize the CPU offloading of State Space Model (SSM) hidden states. The Problem: In SSMs (li...	03-08 19:28	Success	-	View
exp_self.20260308192546.001_20260308_192629 Paper: self.20260308192546.001	Adaptive-Resolution State Cache (ARSC) Benchmark README.md Adaptive-Resolution State Cache (ARSC) Benchmark This repository contains a minimal, runnable benchmark for the Adaptive-Resolution State Cache (ARSC) innovation. Concept Standard State Space Models (SSMs) like Mamba maintain...	03-08 19:26	Success	-	View
exp_pytrain.20260308192403.001_20260308_192439 Paper: pytrain.20260308192403.001	Runtime-Validated Plugin Registry Benchmark README.md Runtime-Validated Plugin Registry Benchmark This benchmark demonstrates a robust, loosely coupled plugin architecture using Python's `typing.Protocol` for structural subtyping and `importlib` for dynamic runtime loading. Objective...	03-08 19:24	Success	-	View
exp_self.20260308190055.006_20260308_190124 Paper: self.20260308190055.006	Benchmark: CPU-Pinned State Swapping for Long Context README.md Benchmark: CPU-Pinned State Swapping for Long Context Overview This benchmark tests the hypothesis that an SSM (State Space Model) can handle arbitrarily long sequences (100k+ tokens) on limited VRAM (8GB) by offloading the "cold"...	03-08 19:01	Pending	-	View
exp_pytrain.20260308185926.003_20260308_185944 Paper: pytrain.20260308185926.003	This benchmark verifies the implementation of a robust dynamic plugin loader using Python's standard library. It demonst... README.md This benchmark verifies the implementation of a robust dynamic plugin loader using Python's standard library. It demonstrates structural sub-typing using `typing.Protocol` and runtime module discovery via `importlib`. Features 1....	03-08 18:59	Success	-	View
exp_self.20260308184721.005_20260308_184759 Paper: self.20260308184721.005	Benchmark: Asynchronous State Prefetch Pipeline README.md Benchmark: Asynchronous State Prefetch Pipeline Innovation: Asynchronous State Prefetch Pipeline Concept: Latency Hiding, Double Buffering, Pinned Memory Target: SSM / Mamba-like architectures with large context window...	03-08 18:58	Success	-	View
exp_self.20260308184533.004_20260308_184604 Paper: self.20260308184533.004	This repository contains a synthetic benchmark designed to validate the hypothesis that combining State Space Models (... README.md This repository contains a synthetic benchmark designed to validate the hypothesis that combining State Space Models (SSM), architectural caching optimizations, and dynamic precision techniques** yields superior memory efficienc...	03-08 18:46	Success	-	View
exp_pytrain.20260308184348.002_20260308_184418 Paper: pytrain.20260308184348.002	PEP 695 Generic Plugin Loader Benchmark Overview This benchmark evaluates the use of PEP 695 Type Parameter Syntax (introduced in Python 3.12) to define a generic base class for a dynamic plugin architecture. The Hypothesis Using the new syntax `class Base[T]:` (instead of `c...	03-08 18:44	Success	-	View
exp_self.20260308184213.003_20260308_184237 Paper: self.20260308184213.003	Associative State Retrieval (ASR) Benchmark This benchmark tests the hypothesis that offloading SSM state history to CPU RAM and retrieving it via dot-product attention improves long-context fidelity without exploding GPU VRAM usage. Dependencies - Python 3.8+ - PyTorch 2.0+ - numpy...	03-08 18:42	Success	-	View
exp_self.20260308184014.002_20260308_184045 Paper: self.20260308184014.002	Benchmark: Tiered Delta State Compression README.md Benchmark: Tiered Delta State Compression Overview This benchmark evaluates the "Tiered Delta State Compression" technique. This innovation aims to enable processing of significantly longer sequences (2x length) on fixed hardware...	03-08 18:41	Success	-	View
exp_self.20260308183811.001_20260308_183846 Paper: self.20260308183811.001	Innovation Benchmark: SSM + Cache + Dynamic Precision Co-design README.md Innovation Benchmark: SSM + Cache + Dynamic Precision Co-design Hypothesis Combining State Space Models (SSM), Caching (state persistence), and Dynamic Precision (Automatic Mixed Precision) will improve throughput and...	03-08 18:38	Success	-	View
exp_pytrain.20260308183640.001_20260308_183711 Paper: pytrain.20260308183640.001	Section 1: README.md bash python benchmark.py	03-08 18:37	Success	-	View