Welcome to Incels.is - Involuntary Celibate Forum

Theory idea for ai

Brainy

robot internally draws a virtual 3D map of everything it sees in real time, secondary ai uses reasoning and math to predict distances when drawing 3D map(agentic)

now that it has a map of what it sees, the secondary ai determines the most likely place to carry out the prompted task and the steps to get there

for example, "go to the kitchen and retrieve a spoon"

computer vision has this 3D map of the house

ai recognizes what a kitchen looks like from training data

stands in the kitchen

knows a spoon is likely in a drawer, and would be a single drawer at the topmost level, because reasoning

it opens several drawers and updates the 3D map in real time, simultaneously recognizing the contents in each drawer and internally deploying agents to analyze and debate on the objects seen

it uses that data and updates a workflow that it holds to memory

maybe this could bypass having to train an entirely new model
 
I think this is sort of what google wants to do with Genie 3. The issue is that while these tasks seem simple, they require quite a bit of long-horizon task ability and awareness that computers lack
 
let's see if gpt-5 has a solution

Goal

Give the robot just enough long-horizon ability using simple, reliable pieces (no giant new model).


Architecture (minimal)

1. Task Graph (HTN-lite): Decompose goal → ordered/substitutable steps.


2. Skills Registry: Reusable atomic skills (navigate, open, detect, grasp).


3. Blackboard Memory: Shared state (goal, map, beliefs, checklist, timers).


4. Event FSM: Simple state machine with triggers (success, timeout, low-conf, collision).


5. Belief Tracker: P(object | place/container). Updates after each observation.


6. Replanner: Only fires on triggers; otherwise stick to plan (prevents thrash).



Representations (tiny but powerful)

Checklist: [reach kitchen] → [scan counters] → [open top drawers left→right] → [inspect utensil caddy] → [expand radius].

Belief table: {top_drawer: 0.55, utensil_caddy: 0.25, dish_rack: 0.1, others: 0.1}.

Stop rules: success OR belief_max < θ → widen search.

Failure codes: {NO_GRASP, NO_OPEN, NO_VISUAL, PATH_BLOCKED} → mapped to specific fixes.
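
A minimal sketch of how the belief table and stop rule above could work. The function names, the `miss_rate` of 0.1, and the threshold of 0.5 are assumptions for illustration, not from the original reply. It applies a plain Bayes update after the robot looks in a container and does not see the object.

```python
from typing import Dict


def update_after_miss(beliefs: Dict[str, float], searched: str,
                      miss_rate: float = 0.1) -> Dict[str, float]:
    """Bayes update after looking in `searched` and NOT seeing the object.

    miss_rate (assumed) is the chance the detector overlooks an object
    that is actually there, so the belief shrinks but never hits zero.
    """
    unnormalized = {
        place: p * (miss_rate if place == searched else 1.0)
        for place, p in beliefs.items()
    }
    total = sum(unnormalized.values())
    return {place: p / total for place, p in unnormalized.items()}


def should_widen(beliefs: Dict[str, float], theta: float = 0.5) -> bool:
    """Stop rule: widen the search when no place is believed above theta."""
    return max(beliefs.values()) < theta


beliefs = {"top_drawer": 0.55, "utensil_caddy": 0.25, "dish_rack": 0.1, "others": 0.1}
updated = update_after_miss(beliefs, "top_drawer")
for place, p in sorted(updated.items(), key=lambda kv: -kv[1]):
    print(f"{place}: {p:.3f}")
```

After the miss, most of the probability mass moves to the utensil caddy, which is what you would want the next checklist item to reflect.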


Control loop (pseudo)

1. Read next checklist item.


2. Run skill; log outcome + confidence.


3. Update beliefs + blackboard.


4. If trigger → call replanner; else continue.


5. On success → compress trace to memory (home-specific prior update).
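
The loop above can be sketched roughly like this. Everything here is a stand-in: the `Blackboard` fields, the stub skills, and the `replan` callback are hypothetical, and real skills would talk to the robot instead of returning canned values.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Blackboard:
    goal: str
    checklist: List[str]
    log: List[Tuple[str, bool, float]] = field(default_factory=list)
    replans: int = 0


# A skill returns (success, confidence). Hypothetical signature.
Skill = Callable[[Blackboard], Tuple[bool, float]]


def run(board: Blackboard, skills: Dict[str, Skill],
        replan: Callable[[Blackboard, str], List[str]],
        min_conf: float = 0.6, max_replans: int = 3) -> bool:
    queue = list(board.checklist)
    while queue:
        step = queue.pop(0)                  # 1. read next checklist item
        ok, conf = skills[step](board)       # 2. run skill
        board.log.append((step, ok, conf))   # 2-3. log outcome on the blackboard
        if not ok or conf < min_conf:        # 4. trigger fired
            if board.replans >= max_replans:
                return False
            board.replans += 1
            queue = replan(board, step) + queue
    return True


# Stub skills, for illustration only.
skills: Dict[str, Skill] = {
    "reach kitchen": lambda b: (True, 0.9),
    "open top drawer": lambda b: (False, 0.0),   # pretend the drawer is stuck
    "check utensil caddy": lambda b: (True, 0.8),
}


def replan(board: Blackboard, failed_step: str) -> List[str]:
    return ["check utensil caddy"] if failed_step == "open top drawer" else []


board = Blackboard(goal="get a spoon", checklist=["reach kitchen", "open top drawer"])
done = run(board, skills, replan)
print(done, board.replans, [s for s, _, _ in board.log])
```

The replanner only runs when a step fails or comes back with low confidence, which matches the "only fires on triggers" idea. Step 5 (compressing the trace to memory) is left out of this sketch.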



Triggers (keep it small)

Timeout per step (e.g., 20–40s).

Low confidence (<0.6 detection or belief).

Force anomaly while opening.

Localization drift (>0.3 m) → quick relocalize.
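
Here is one way the four triggers could be checked in code. The 30 s timeout sits inside the 20–40 s range above and the 0.6 and 0.3 m thresholds come from the list, but the 40 N force limit and the check order (safety-related triggers first) are assumptions.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Trigger(Enum):
    TIMEOUT = auto()
    LOW_CONFIDENCE = auto()
    FORCE_ANOMALY = auto()
    LOCALIZATION_DRIFT = auto()


@dataclass
class StepStatus:
    elapsed_s: float    # time spent on the current step
    confidence: float   # detection or belief confidence
    force_n: float      # measured force while opening (unit assumed: newtons)
    drift_m: float      # estimated localization drift in meters


def check_triggers(s: StepStatus, timeout_s: float = 30.0, min_conf: float = 0.6,
                   max_force_n: float = 40.0, max_drift_m: float = 0.3) -> Optional[Trigger]:
    """Return the first trigger that fires, or None if the step may continue."""
    if s.force_n > max_force_n:
        return Trigger.FORCE_ANOMALY
    if s.drift_m > max_drift_m:
        return Trigger.LOCALIZATION_DRIFT
    if s.elapsed_s > timeout_s:
        return Trigger.TIMEOUT
    if s.confidence < min_conf:
        return Trigger.LOW_CONFIDENCE
    return None


print(check_triggers(StepStatus(elapsed_s=12.0, confidence=0.45, force_n=5.0, drift_m=0.1)))
```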


Reasoning shortcuts (cheap wins)

Affordance priors: “Spoon ↦ top drawer near sink/stove” (few dozen rules cover most objects).

Container ordering: top → middle → bottom; left→right.

Active perception: change pose if detection <0.6, then re-score.

Exploration bandit: pick next container by UCB on belief (fast, no full POMDP).

Memory bias: exponential moving average per home (fast personalization).
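
A rough sketch of the bandit and memory-bias ideas above. The scoring formula is a UCB-style bonus added to the belief (not textbook UCB1), and the constants `c` and `alpha` are assumed values.

```python
import math
from typing import Dict


def ucb_pick(belief: Dict[str, float], visits: Dict[str, int], c: float = 0.5) -> str:
    """Pick the container with the highest belief plus exploration bonus."""
    total = sum(visits.values()) + 1

    def score(name: str) -> float:
        bonus = c * math.sqrt(math.log(total + 1) / (visits[name] + 1))
        return belief[name] + bonus

    return max(belief, key=score)


def ema(prior: float, observed: float, alpha: float = 0.2) -> float:
    """Exponential moving average: nudge a per-home prior toward what was seen."""
    return (1 - alpha) * prior + alpha * observed


belief = {"top_drawer": 0.55, "utensil_caddy": 0.25, "dish_rack": 0.1}
first = ucb_pick(belief, {"top_drawer": 0, "utensil_caddy": 0, "dish_rack": 0})
later = ucb_pick(belief, {"top_drawer": 3, "utensil_caddy": 0, "dish_rack": 0})
print(first, later)
```

With no visits the bonus is equal everywhere, so the highest belief wins. After a few fruitless visits to the top drawer, the bonus on untried containers is large enough that the pick shifts, even before the belief table itself is updated.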


Example: “Get a spoon” (FSM)

1. GoTo(Kitchen) → Verify(Scene=Kitchen).


2. OpenSet = TopDrawers(kitchen, left→right).


3. For d in OpenSet: Open(d) → Detect(spoon) → if found: Grasp+Deliver; else Belief↓ and continue.


4. If none: Check(UtensilCaddy) → Check(DishRack) → Expand(SearchRadius=1.5m).


5. On any failure code → specific recovery (regrasp, re-open with higher force limit, change camera angle).
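
The search order in this example can be sketched as below. The `contents` dict is a fake world standing in for real perception, and the container names are made up. The `max_opens` cap of 8 is an assumed budget for how many drawers get opened.

```python
from typing import Dict, List, Optional, Set


def find_object(target: str, top_drawers: List[str], fallbacks: List[str],
                contents: Dict[str, Set[str]], max_opens: int = 8) -> Optional[str]:
    """Check top drawers left to right, then fallback spots; return where found."""
    for drawer in top_drawers[:max_opens]:      # step 3: Open(d) -> Detect(target)
        if target in contents.get(drawer, set()):
            return drawer
    for place in fallbacks:                     # step 4: caddy, dish rack, ...
        if target in contents.get(place, set()):
            return place
    return None                                 # caller would expand the search radius


# Fake kitchen, for illustration only.
contents = {
    "drawer_1": {"fork"},
    "drawer_2": {"spoon", "knife"},
    "utensil_caddy": {"spatula"},
}
drawers = ["drawer_1", "drawer_2", "drawer_3"]
fallbacks = ["utensil_caddy", "dish_rack"]
print(find_object("spoon", drawers, fallbacks, contents))
```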



Guardrails

Budget caps: max drawers=8, max replans=3.

Safety envelope: collision watchdog + dynamic obstacle veto.

Explainability: keep a step log (“why here?” → “highest belief: 0.55 top drawer”).


Training/No-training mix

Use off-the-shelf vision (open-vocab detection/segmentation).

Tiny adapters only where needed (handle detection, grasp scorer).

Learn priors from experience (no end-to-end retraining).


Metrics (to know it works)

SR@N: success within N container opens.

Time-to-first-sighting.

# replans / task.

Collision & force limit trips.
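
SR@N is simple to compute from episode logs. A small sketch, assuming each episode is recorded as the number of container opens it took to succeed, or `None` if the task failed (that log format is my assumption):

```python
from typing import List, Optional


def success_rate_at_n(opens_to_success: List[Optional[int]], n: int) -> float:
    """Fraction of episodes that succeeded within n container opens."""
    if not opens_to_success:
        return 0.0
    hits = sum(1 for k in opens_to_success if k is not None and k <= n)
    return hits / len(opens_to_success)


episodes = [1, 3, None, 5]   # made-up numbers, not real results
print(success_rate_at_n(episodes, n=3))
```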


Implementation order (fast path)

1. FSM + checklist + skills registry.


2. Belief table + UCB container selection.



3. Four triggers + tiny replanner.


4. Memory of successful locations per home.


5. Add more priors only if metrics stall.
 
interesting what prompt did you feed it? did it use the thinking mode? chatgpt 5 is hit or miss with its router functionality
 
gpt-5 thinking

prompt: "The issue is while these task seem simple they require quite a bit of long horizon task ability and awareness that the computers lack"
 
@zd60

prompt: "my colleague said "chatgpt 5 is hit or miss with its router functionality" to this reply, what did he mean by that"


Meaning

“Hit or miss” = inconsistent.

“Router functionality” = the part that decides which tool/skill/agent to use next.


Two common “router” uses

1. Tool/function routing: pick between calling code, search, database, or a robot skill (navigate, open, grasp).


2. Expert/model routing: pick which sub-model/agent handles a subtask (mixture-of-experts).



Why it feels inconsistent

Tools overlap; triggers aren’t crystal-clear.

Prompts vary; small wording changes flip choices.

Long context → the model forgets constraints.

It may hallucinate instead of calling a tool or call the wrong one.

Temperature/stochasticity adds randomness.


Symptoms you’d see

Calls no tool when it should (e.g., “search” not invoked).

Calls the wrong skill (scans counters before checking drawers).

Loops/retries needlessly; ignores schemas/args.


In your robot example

Router must choose: Navigate → OpenDrawer → Detect → Grasp.

“Hit or miss” = it sometimes picks a poorer order, skips verification, or grasps without a solid detection.


Mitigations (practical)

1. Tight tool specs with mutually exclusive triggers (“If drawer closed → OpenDrawer, else Detect”).


2. Small rule/FSM front-end as the primary router; use the LLM only for ambiguous cases.


3. Few-shot examples showing correct tool choice and argument shapes.


4. Limit the tool set per task; remove near-duplicates.


5. Lower temperature; enforce max tool calls; add timeouts.


6. Validate choices (ask model to state: “I’m calling OpenDrawer because …”; auto-check the reason).


7. Backoff plan: if a call fails, retry with structured error; otherwise fall back to a safe default order.


8. Evals: create router tests (same goal, varied wording) and track accuracy over time.



Bottom line

Treat the LLM router as advisory. Keep a deterministic backbone for safety-critical steps.
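
A minimal sketch of mitigation 2 combined with a safe default, assuming a tiny state dict and a hypothetical `llm_suggest` callback (no real LLM API is called here). Rules decide whenever the state is known; the model is only asked when the rules cannot tell, and its answer is checked against the allowed skills.

```python
from typing import Callable, Dict, Optional

ALLOWED_SKILLS = {"Navigate", "OpenDrawer", "Detect", "Grasp"}
State = Dict[str, Optional[bool]]


def rule_router(state: State) -> Optional[str]:
    """Deterministic rules with mutually exclusive triggers; None means ambiguous."""
    drawer_open = state.get("drawer_open")
    detected = state.get("object_detected")
    if drawer_open is False:
        return "OpenDrawer"
    if drawer_open is True and detected is False:
        return "Detect"
    if drawer_open is True and detected is True:
        return "Grasp"
    return None


def route(state: State, llm_suggest: Callable[[State], str],
          safe_default: str = "Detect") -> str:
    choice = rule_router(state)
    if choice is not None:
        return choice                       # rules win whenever they apply
    suggestion = llm_suggest(state)         # advisory only
    return suggestion if suggestion in ALLOWED_SKILLS else safe_default


print(route({"drawer_open": False}, lambda s: "Grasp"))
```

Even if the model suggests grasping, a closed drawer still routes to `OpenDrawer`, which is the "deterministic backbone" idea in practice.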
 
fascinating, mine just spewed out a list of things that are possible
 
1) Hierarchical planning (planner → executor).
Decompose the goal into a DAG of sub-tasks with clear exit criteria. Let the model update a living “plan of record” instead of winging it.


2) External, persistent memory.
Keep facts, decisions, and open loops in a project workspace (notes/KB + vector search). Summaries + snapshots prevent context drift and allow resume after interruptions.


3) Tool use and stateful actions.
Give the AI functions/APIs (calendar, files, email, spreadsheets, web, databases) so it can act and read/write state, not just chat.


4) Progress checks and invariants.
Attach checklists, tests, and assertions to each sub-task (e.g., “budget column sums to total,” “dates don’t conflict”). Use an automatic “critic” step to verify outputs before moving on.


5) World/model of the task.
Encode constraints and relationships (deadlines, dependencies, capacities) as a lightweight schema/knowledge graph so the system can simulate consequences and detect conflicts early.
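
To make points 1 and 4 concrete, here is a small sketch using Python's standard `graphlib` (3.9+). The task names and the budget numbers are invented for illustration. The DAG gives a valid execution order, and the invariant is the "budget column sums to total" check from the list.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each sub-task maps to the set of sub-tasks it depends on (made-up example).
deps: Dict[str, Set[str]] = {
    "set budget": set(),
    "pick date": set(),
    "book venue": {"set budget", "pick date"},
    "send invites": {"book venue"},
}

# static_order() yields tasks so that every dependency comes first.
order: List[str] = list(TopologicalSorter(deps).static_order())


def budget_sums_to_total(rows: List[float], total: float, tol: float = 0.01) -> bool:
    """Invariant check to run before moving on from the budget step."""
    return abs(sum(rows) - total) <= tol


print(order)
print(budget_sums_to_total([1200.0, 300.0, 500.0], 2000.0))
```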
 
i've set my settings so it's more adhd-friendly and uses bullet points
 
Oh I barely use mine, I only have the subscription for this month to try it. I stick to local models or other platforms
 
i gotcha, it's been so long since i've played around with local models

i should see what thebloke has uploaded recently
 
i prompted with that entire response:

  • Quick take
    • Strong foundations. Add a few pieces to make it reliable at long horizons.
  • Additions (fill the gaps)
    6) Belief/uncertainty tracking → keep P(state) and update after each action.
    7) Failure & recovery policy → retry/backoff trees, timeouts, fallbacks.
    8) Safety & permissions → scopes, rate/cost budgets, human approvals.
    9) Telemetry & evals → traces, regressions, router/tool accuracy tests.
    10) Scheduler/priorities → queue, SLAs, preemption across tasks.
  • Implementation (per your 5)
    1. Hierarchical planning
      • Represent plan as a DAG with node types: {goal, step, check, decision}.
      • Exit criteria = assertions; store status per node; replan only on triggers.
    2. External, persistent memory
      • KV store (facts), vector store (notes/docs), timeline (decisions/open loops).
      • Nightly snapshots + short rolling summary for context injection.
    3. Tool use & stateful actions
      • Strict function specs (arguments, idempotency, side-effect notes).
      • Transaction log: {tool, input, output, result, cost, latency}.
    4. Progress checks & invariants
      • Pre/post hooks: validate(), critic() before advancing DAG edge.
      • Auto-tests per step; block merge on any failed assertion.
    5. World/model of the task
      • Lightweight schema: entities, constraints, dependencies, capacities.
      • Fast solver: detect conflicts early; propose fixes (shift dates, split work).
  • Minimal data contracts (keep it simple)
    • PlanOfRecord: {goal, nodes:[{id,type,owner,inputs,assertions,status}], edges:[(id→id)]}.
    • Memory: {facts[], notes[], decisions[], open_loops[], snapshot}.
    • ToolCall: {name, args, preconds, postconds, retries, result, error}.
    • Constraint: {subject, relation, object, window, priority}.
  • Control loop (steady, not chatty)
    1. Expand next frontier node(s) in DAG.
    2. Execute tool/step.
    3. Update memory + beliefs.
    4. Run critic/assertions; if fail → recovery policy; else advance.
    5. Emit trace + metrics; repeat.
  • Guardrails
    • Budgets (time/cost/tool quotas), permissions, PII redaction, rollback on failed postconds.
  • Metrics to track
    • Task SR, step SR, mean retries, critic catch rate, router accuracy, cost/time per task.
  • Fast stack (example)
    • Orchestrator (FSM/HTN), DAG lib, KV + vector DB, function router, OR-Tools/CP-SAT for constraints, tracing/evals harness.
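
The PlanOfRecord contract and the "expand next frontier node(s)" step could look something like this. It is a trimmed-down sketch: only `id`, `type`, and `status` are kept from the node fields, the status values are assumed, and the example plan is invented.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    id: str
    type: str                  # one of: goal, step, check, decision
    status: str = "pending"    # assumed values: pending | done | failed


@dataclass
class PlanOfRecord:
    goal: str
    nodes: List[Node]
    edges: List[Tuple[str, str]]   # (from_id, to_id)

    def frontier(self) -> List[str]:
        """Pending nodes whose predecessors are all done."""
        done = {n.id for n in self.nodes if n.status == "done"}
        blocked = {dst for src, dst in self.edges if src not in done}
        return [n.id for n in self.nodes
                if n.status == "pending" and n.id not in blocked]


plan = PlanOfRecord(
    goal="plan event",
    nodes=[Node("budget", "step"), Node("venue", "step"),
           Node("date", "decision"), Node("invites", "step")],
    edges=[("budget", "venue"), ("budget", "date"),
           ("venue", "invites"), ("date", "invites")],
)
before = plan.frontier()
plan.nodes[0].status = "done"    # mark "budget" as finished
after = plan.frontier()
print(before, after)
```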
 
Local models have come quite a long way, and the chinese models do the best so far. I've played around a bit with gpt-oss, but it seems to be heavily trained on synthetic data like the phi models and kinda loses coherence fast outside of tool calls and instruction-heavy tasks
 
Pretty interesting. I do think these models are just stepping stones to a future full of real AGI/ASI, and that LLMs won't be going anywhere, but we will need some sort of new breakthrough in technology to get past the hurdles I think we will hit in the next S-curve
 
i will have to check them out
 
Unless you have a really good computer you'll probably be restricted to the gpt-oss-20b model, because the 120b model might be a little tough to run. It can still be good for some tasks, it's just more akin to the phi models and seems heavily synthetically trained
 
i have a 3090 but that's it

we'll see where things go, especially with google and china
 
I run a 3090 too.

google has more than they are letting on. OpenAI still has a small lead overall though, across most benchmarks and use cases.
 
google definitely has something up their sleeve, and i think zuck knows it if he's frantically paying that much money for talent
 
