Custom Connectors

Connect any data source to your knowledge graph - not just the 250+ built-in integrations. If it has an API or a webhook, or can export data, it can feed your graph.

Two Integration Models

Graphory supports two ways to bring in custom data. Choose based on your use case:

📤 Push Model (Ingest API)

Your system pushes data to Graphory as it becomes available. Best for webhooks, event-driven systems, and one-time imports.

📥 Pull Model (Custom Collector)

A script fetches data from your API on a schedule and forwards it to Graphory. Best for APIs that need polling, batch jobs, and recurring syncs.

Push Model: Ingest API

The fastest way to get custom data in. Send a POST request to the Ingest API from any system that can make HTTP requests. No collector setup, no scheduling - just push and go.

Supported source types

Example: REST API

Fetch data from any REST API and push it to your graph:

Python
import requests

# Fetch from your internal API
response = requests.get(
    "https://internal.yourcompany.com/api/deals",
    headers={"Authorization": "Bearer your_internal_key"}
)
deals = response.json()

# Push each deal to Graphory
for deal in deals:
    requests.post(
        "https://api.graphory.io/ingest",
        headers={"Authorization": "Bearer gs_ak_your_api_key"},
        json={
            "entity": "your-org",
            "source": "internal-crm",
            "title": deal["name"],
            "body": f"Stage: {deal['stage']}. Value: ${deal['value']}. Contact: {deal['contact']}",
            "type": "deal",
            "date": deal["updated_at"][:10]
        }
    )
    print(f"Ingested: {deal['name']}")
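Many REST APIs paginate their results, so a single GET like the one above may return only the first page. A minimal cursor-following sketch - the `items`/`next_cursor` key names are assumptions, so adapt them to your API's response shape:

```python
def fetch_all(get_page):
    """Collect every item from a cursor-paginated API.

    `get_page(cursor)` should perform one request (e.g. a requests.get
    wrapper) and return a dict like {"items": [...], "next_cursor": ...}.
    A missing or None cursor means the last page was reached.
    """
    items, cursor = [], None
    while True:
        page = get_page(cursor)
        items.extend(page["items"])
        cursor = page.get("next_cursor")
        if cursor is None:
            return items
```

You would then loop over `fetch_all(...)` instead of `response.json()` before pushing each record to the Ingest API.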

Example: GraphQL API

Python
import requests

# Query your GraphQL endpoint
query = """
query {
  projects(status: ACTIVE) {
    id
    name
    description
    updatedAt
    team { name }
  }
}
"""
result = requests.post(
    "https://api.linear.app/graphql",
    headers={"Authorization": "Bearer lin_your_key"},
    json={"query": query}
).json()

# Push to Graphory
for project in result["data"]["projects"]:
    team_name = project["team"]["name"]  # query returns a single team object
    requests.post(
        "https://api.graphory.io/ingest",
        headers={"Authorization": "Bearer gs_ak_your_api_key"},
        json={
            "entity": "your-org",
            "source": "linear",
            "title": project["name"],
            "body": f"{project['description']}\nTeam: {team_name}",
            "type": "project",
            "date": project["updatedAt"][:10]
        }
    )

Example: Webhook Relay

For webhook-driven sources (Stripe events, GitHub webhooks, etc.), set up a simple relay:

Node.js (Express)
const express = require('express');
const app = express();
app.use(express.json());

app.post('/webhook/stripe', async (req, res) => {
  // In production, verify the webhook signature before trusting the payload
  const event = req.body;

  await fetch('https://api.graphory.io/ingest', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer gs_ak_your_api_key',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      entity: 'your-org',
      source: 'stripe',
      title: `${event.type}: ${event.data.object.id}`,
      body: JSON.stringify(event.data.object, null, 2),
      type: event.type,
      date: new Date(event.created * 1000).toISOString().slice(0, 10),
      idempotency_key: event.id
    })
  });

  res.json({ received: true });
});

app.listen(3000);

Example: CSV Import

Python
import csv
import requests

GRAPHORY_KEY = "gs_ak_your_api_key"

with open("contacts.csv") as f:
    reader = csv.DictReader(f)
    items = []
    for row in reader:
        items.append({
            "entity": "your-org",
            "source": "csv-import",
            "title": f"{row['first_name']} {row['last_name']}",
            "body": f"Email: {row['email']}\nCompany: {row['company']}\nRole: {row['role']}",
            "type": "contact",
            "idempotency_key": f"csv-{row['email']}"
        })

# Batch ingest (up to 100 items per request)
for i in range(0, len(items), 100):
    batch = items[i:i+100]
    resp = requests.post(
        "https://api.graphory.io/ingest",
        headers={"Authorization": f"Bearer {GRAPHORY_KEY}"},
        json={"items": batch}
    )
    print(f"Batch {i//100 + 1}: {resp.json()}")

Example: Bash / curl

bash
# Quick one-liner to push a note
curl -X POST https://api.graphory.io/ingest \
  -H "Authorization: Bearer gs_ak_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "entity": "your-org",
    "source": "manual",
    "title": "Meeting notes - Product review",
    "body": "Decided to launch v2 next month. Key blocker: API docs need updating.",
    "type": "note"
  }'

Pull Model: Custom Collectors

For APIs that need regular polling, build a collector script and run it on a schedule. This is how Graphory's built-in connectors work internally.

How it works

Write a collector script

A script that authenticates with your API, fetches new data, and pushes it to the Ingest API. Can be in any language.

Track what you have seen

Use idempotency keys or a local state file to avoid re-ingesting the same data on each run.

Schedule it

Run on a cron schedule, a GitHub Action, or any task scheduler. Hourly, daily, or whatever fits your data.

Example: Custom API Collector

Python
#!/usr/bin/env python3
"""Custom collector for an internal ticket system."""
import os
import json
import requests
from datetime import datetime, timedelta

GRAPHORY_KEY = os.getenv("GRAPHORY_API_KEY")
TICKET_API = os.getenv("TICKET_API_URL")
TICKET_TOKEN = os.getenv("TICKET_API_TOKEN")
STATE_FILE = "last_sync.json"

def load_state():
    try:
        with open(STATE_FILE) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"last_sync": (datetime.utcnow() - timedelta(days=30)).isoformat()}

def save_state(state):
    with open(STATE_FILE, "w") as f:
        json.dump(state, f)

def collect():
    state = load_state()
    since = state["last_sync"]

    # Fetch new tickets since last sync
    resp = requests.get(
        f"{TICKET_API}/tickets",
        headers={"Authorization": f"Bearer {TICKET_TOKEN}"},
        params={"updated_since": since, "limit": 100}
    )
    tickets = resp.json()["tickets"]

    if not tickets:
        print("No new tickets.")
        return

    # Push to Graphory
    items = []
    for t in tickets:
        items.append({
            "entity": "your-org",
            "source": "internal-tickets",
            "title": f"[{t['id']}] {t['subject']}",
            "body": f"Status: {t['status']}\nPriority: {t['priority']}\nAssigned: {t['assignee']}\n\n{t['description']}",
            "type": "ticket",
            "date": t["updated_at"][:10],
            "idempotency_key": f"ticket-{t['id']}-{t['updated_at']}"
        })

    resp = requests.post(
        "https://api.graphory.io/ingest",
        headers={"Authorization": f"Bearer {GRAPHORY_KEY}"},
        json={"items": items}
    )
    print(f"Ingested {len(items)} tickets: {resp.json()}")

    # Update state
    save_state({"last_sync": datetime.utcnow().isoformat()})

if __name__ == "__main__":
    collect()

Schedule with cron:

crontab
# Run every hour
0 * * * * cd /path/to/collector && python3 collect_tickets.py >> collect.log 2>&1
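If you'd rather not manage a server, the same collector can run on a schedule in GitHub Actions. A minimal workflow sketch - the file paths and secret names are assumptions:

```yaml
# .github/workflows/collect.yml
name: Collect tickets
on:
  schedule:
    - cron: "0 * * * *"   # hourly, same as the crontab above
jobs:
  collect:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install requests
      - run: python3 collect_tickets.py
        env:
          GRAPHORY_API_KEY: ${{ secrets.GRAPHORY_API_KEY }}
          TICKET_API_URL: ${{ secrets.TICKET_API_URL }}
          TICKET_API_TOKEN: ${{ secrets.TICKET_API_TOKEN }}
```

One caveat: a state file like `last_sync.json` does not persist between Actions runs unless you cache or commit it. Idempotency keys make any re-ingestion harmless either way.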

Best Practices

Always use idempotency keys

Prevent duplicate data by including an idempotency_key with every ingest call. Use a combination of the source system's ID and a timestamp or version number.
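A small helper makes the convention consistent across collectors. This is only a sketch of the ID-plus-version pattern described above; the function name is our own:

```python
def make_idempotency_key(source: str, record_id, version: str) -> str:
    """Build a stable dedup key from the source system's ID plus a
    version or timestamp: re-running a sync skips unchanged records,
    while a genuine update (new version) is ingested as a fresh item."""
    return f"{source}-{record_id}-{version}"
```

For example, `make_idempotency_key("internal-crm", 42, "2026-04-01T12:00:00Z")` yields a key that changes only when the record itself does.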

Batch when possible

The Ingest API accepts up to 100 items per request in the items array. Batching reduces HTTP overhead and is faster than individual requests.

Structure your body text

The extractor works best with clear, labeled fields in the body text. Use a format like Field: Value on separate lines.

Good body format
Name: Sarah Chen
Company: Beta Industries
Role: VP of Engineering
Email: sarah@beta.io
Met at: SaaStr 2026
Notes: Interested in enterprise plan. Follow up next week.
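If your records arrive as dicts, a one-line helper (our own, not part of the Graphory SDK) produces this format reliably:

```python
def format_body(fields: dict) -> str:
    """Render labeled fields as 'Field: Value' lines for the ingest body."""
    return "\n".join(f"{label}: {value}" for label, value in fields.items())

body = format_body({
    "Name": "Sarah Chen",
    "Company": "Beta Industries",
    "Role": "VP of Engineering",
})
```

Pass the result as the `body` field of your ingest payload.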

Choose meaningful source names

The source field helps you filter and identify where data came from. Use consistent, descriptive names like "internal-crm", "stripe-webhooks", or "csv-import-2026-04".

Handle errors gracefully

The Ingest API returns standard HTTP status codes. Implement retry logic for 429 (rate limited) and 500 (server error) responses. 400 errors indicate bad data and should be logged for investigation.
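A retry loop with exponential backoff can be sketched as follows. The wrapper is generic: `send(payload)` stands in for your HTTP call (e.g. a `requests.post` wrapper returning `(status_code, body)`), which also keeps the retry logic testable:

```python
import time

def ingest_with_retry(send, payload, max_retries=5, sleep=time.sleep):
    """Call send(payload), retrying 429 and 5xx responses with
    exponential backoff (1s, 2s, 4s, ...). Other 4xx responses are
    bad data: raise immediately so they get logged, not retried."""
    for attempt in range(max_retries):
        status, body = send(payload)
        if status == 200:
            return body
        if status == 429 or status >= 500:
            sleep(2 ** attempt)
            continue
        raise ValueError(f"Ingest rejected ({status}): {body}")
    raise RuntimeError(f"Giving up after {max_retries} attempts")
```

For production use you may also want to honor the `Retry-After` header on 429 responses, if the API provides one.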

What Can You Connect?

If it produces data, it can feed your graph. Common custom sources include:

REST APIs

Any API that returns JSON or XML. Internal tools, SaaS platforms, IoT endpoints.

GraphQL APIs

Query exactly what you need from Linear, GitHub, Hasura, or custom GraphQL servers.

Webhooks

Stripe events, GitHub pushes, Shopify orders, Twilio messages - relay them to Ingest.

CSV / Excel

Bulk import from spreadsheets. CRM exports, financial reports, contact lists.

Databases

Query Postgres, MySQL, MongoDB, or any database and push results to your graph.

AI Conversations

Push insights from Claude, ChatGPT, or other AI conversations back into your graph.
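The database case above can be sketched with the standard library's sqlite3; the same query-then-shape pattern applies to Postgres or MySQL via their drivers. The table and column names here are hypothetical - adapt them to your schema:

```python
import sqlite3

def fetch_customer_items(db_path: str):
    """Query a (hypothetical) customers table and shape each row as an
    Ingest API item, following the best practices above: labeled body
    fields and an ID-plus-timestamp idempotency key."""
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row  # access columns by name
    rows = conn.execute(
        "SELECT id, name, plan, mrr, updated_at FROM customers"
    ).fetchall()
    conn.close()
    return [
        {
            "entity": "your-org",
            "source": "billing-db",
            "title": row["name"],
            "body": f"Plan: {row['plan']}\nMRR: ${row['mrr']}",
            "type": "customer",
            "date": row["updated_at"][:10],
            "idempotency_key": f"customer-{row['id']}-{row['updated_at']}",
        }
        for row in rows
    ]

# POST the resulting items to https://api.graphory.io/ingest in batches
# of up to 100, exactly as in the CSV import example above.
```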

Need help building a custom connector? Contact us - we can help with architecture and best practices for your specific data source.