Introduction to Spykio

Spykio provides high-precision document retrieval capabilities with accuracy that exceeds traditional RAG (Retrieval Augmented Generation) systems. Our API allows you to upload documents and query them with natural language to get precise answers and references.

Beta Access
Spykio is currently in beta. Expect some rough edges and bugs.

Quick Start

Install our client library for your platform and start using Spykio in minutes.

Installation

Terminal (npm)
$ npm install spykio-client

Basic usage

JavaScript
const { SpykioClient } = require('spykio-client'); // CommonJS
// or, in an ES module (use one form, not both):
// import { SpykioClient } from 'spykio-client';

const client = new SpykioClient({
  apiKey: 'your-api-key'
});

// Search in an index
const result = await client.query.search({
  index: 'your-index-name',
  userQuery: 'What hotel did I stay at in Warsaw?'
});

// Upload a file to an index
const uploadResult = await client.files.upload({
  index: 'your-index-name',
  mimeType: 'application/pdf',
  base64String: 'base64EncodedString'
});

Installation

Client Libraries

Spykio provides official client libraries for several programming languages:

Terminal
npm install spykio-client

Our Node.js client works in both browser and server environments.

API Keys

To use Spykio, you'll need an API key. You can find your API key in the keys section of the dashboard.

API Key Security
Keep your API key secure and never expose it in client-side code. Use environment variables or a secure backend to store your keys.
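One way to follow this advice on a server is to read the key from an environment variable at startup. The sketch below is illustrative only: the helper getApiKey and the variable name SPYKIO_API_KEY are assumptions, not something defined by the Spykio client.

```javascript
// Hypothetical helper; SPYKIO_API_KEY is an assumed variable name,
// not one defined by Spykio.
function getApiKey(env = process.env) {
  const key = env.SPYKIO_API_KEY;
  if (typeof key !== 'string' || key.trim() === '') {
    // Fail early instead of sending requests with an empty key
    throw new Error('SPYKIO_API_KEY is not set');
  }
  return key.trim();
}

// Usage (server side only):
// const client = new SpykioClient({ apiKey: getApiKey() });
```

Failing at startup when the variable is missing tends to be easier to diagnose than an authentication error on the first request.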

Document Upload

Spykio supports uploading various document formats, including PDF, DOCX, TXT, HTML, and more. You can upload documents via the API or through our dashboard interface.

API Upload

JavaScript
// Upload using base64 string
const result = await client.files.upload({
  index: 'your-index-name',
  mimeType: 'application/pdf', // MIME type of the file
  base64String: 'base64EncodedString' // Base64 encoded file content
});
  
// Upload using text content
const textResult = await client.files.upload({
  index: 'your-index-name',
  content: 'This is the text content to be uploaded.'
});

Upload Options

Parameter    | Type   | Required | Description
index        | string | Yes      | The index to upload the document to
content      | string | No*      | The document content as plain text. Do not use this for file uploads
mimeType     | string | No*      | The MIME type of the file
base64String | string | No*      | The base64-encoded file content
* Provide either content or a file upload (base64String), but not both. Exactly one of the two is required. When uploading a file, make sure to also provide the mimeType.
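The rule in the footnote can be enforced before calling the API. The following is a minimal sketch, assuming a Node.js environment; buildUploadParams and its filePath option are hypothetical names introduced here, and the returned object is meant to be passed to client.files.upload.

```javascript
const fs = require('fs');

// Hypothetical helper, not part of spykio-client: builds the argument
// object for client.files.upload from either text or a file on disk.
function buildUploadParams({ index, content, filePath, mimeType }) {
  if (!index) throw new Error('index is required');

  const hasText = typeof content === 'string';
  const hasFile = typeof filePath === 'string';
  if (hasText === hasFile) {
    // Both or neither were given
    throw new Error('Provide exactly one of content or filePath');
  }

  if (hasText) return { index, content };

  if (!mimeType) throw new Error('mimeType is required for file uploads');
  // Buffer#toString('base64') yields the base64-encoded file content
  const base64String = fs.readFileSync(filePath).toString('base64');
  return { index, mimeType, base64String };
}

// Usage:
// await client.files.upload(buildUploadParams({
//   index: 'your-index-name',
//   filePath: './invoice.pdf',
//   mimeType: 'application/pdf'
// }));
```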

Supported Document Formats

PDF Documents

.pdf

Word Documents

.doc, .docx

Text Files

.txt

Spreadsheets

.csv, .xlsx

JSON Data

.json

HTML/Web Content

.html, .htm
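Because file uploads need a mimeType, it can help to derive it from the file extension. The sketch below maps the extensions listed above to their standard MIME types; mimeTypeForFile is a hypothetical helper, and whether Spykio expects exactly these strings for every format is an assumption.

```javascript
const path = require('path');

// Standard MIME types for the extensions listed above. Whether Spykio
// expects exactly these strings is an assumption.
const MIME_TYPES = {
  '.pdf': 'application/pdf',
  '.doc': 'application/msword',
  '.docx': 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
  '.txt': 'text/plain',
  '.csv': 'text/csv',
  '.xlsx': 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
  '.json': 'application/json',
  '.html': 'text/html',
  '.htm': 'text/html'
};

// Hypothetical helper: look up the MIME type by (case-insensitive) extension.
function mimeTypeForFile(fileName) {
  const ext = path.extname(fileName).toLowerCase();
  const mimeType = MIME_TYPES[ext];
  if (!mimeType) {
    throw new Error(`Unsupported file extension: ${ext || '(none)'}`);
  }
  return mimeType;
}
```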

Querying Documents

Spykio's document retrieval excels at understanding the semantic meaning of your queries and returning precisely relevant document sections.

Basic Queries

JavaScript
// Basic search with default parameters
const result = await client.query.search({
  index: 'your-index-name',
  userQuery: 'What hotel did I stay at in Warsaw?'
});

// Search with accurate match and relevant info
const detailedResult = await client.query.search({
  index: 'your-index-name',
  userQuery: 'What hotel did I stay at in Warsaw?',
  accurateMatch: true,
  getRelevantInfo: true
});

// Print results
console.log('Search results:', result);

Query Parameters

Parameter       | Type    | Required | Description
index           | string  | Yes      | The name of the index to search in
userQuery       | string  | Yes      | The query text to search for
accurateMatch   | boolean | No       | When true, performs more precise matching (default: false)
getRelevantInfo | boolean | No       | When true, returns specific relevant sections (default: false)
topN            | number  | No       | Maximum number of documents to analyze for extracting relevant sections (default: 5)
maxResults      | number  | No       | Maximum number of documents to return (default: 20)
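The examples above do not set topN or maxResults. A minimal sketch of a call that does, wrapped in a hypothetical helper named searchSections (not part of spykio-client); the client argument is assumed to be a SpykioClient instance, and the defaults mirror the table.

```javascript
// Hypothetical wrapper around client.query.search that validates the
// numeric options before sending the request.
async function searchSections(client, index, userQuery, { topN = 5, maxResults = 20 } = {}) {
  if (!Number.isInteger(topN) || topN < 1) {
    throw new RangeError('topN must be a positive integer');
  }
  if (!Number.isInteger(maxResults) || maxResults < 1) {
    throw new RangeError('maxResults must be a positive integer');
  }
  return client.query.search({
    index,
    userQuery,
    accurateMatch: true,   // request relevant sections rather than
    getRelevantInfo: true, // only a list of documents
    topN,
    maxResults
  });
}

// Usage:
// const result = await searchSections(client, 'your-index-name',
//   'What hotel did I stay at in Warsaw?', { topN: 3 });
```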

Listing Documents

Spykio allows you to retrieve a paginated list of all documents in an index, making it easy to browse and manage your documents.

Retrieving Document Lists

JavaScript
// List all documents in an index with pagination
const result = await client.documents.list({
  index: 'your-index-name',
  region: 'EU',     // Optional, defaults to 'EU'
  limit: 20,        // Optional, defaults to 100
  offset: 0         // Optional, defaults to 0
});

// Access documents and pagination info
console.log('Documents:', result.documents);
console.log('Pagination info:', result.pagination);

Pagination

When dealing with large collections of documents, you can use the limit and offset parameters to paginate through the results.

Response Structure
  • documents - Array of document objects with id, content, summary, and created_at properties
  • pagination - Object with pagination details:
    • total - Total number of documents in the index
    • limit - Maximum number of documents returned
    • offset - Number of documents skipped
    • hasMore - Boolean indicating if there are more documents available
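The limit, offset, and hasMore fields can be combined into a loop that walks the whole index. This is a sketch under the response structure described above; listAllDocuments is a hypothetical helper, and client is assumed to be a SpykioClient instance.

```javascript
// Hypothetical helper: fetch every document in an index, page by page.
async function listAllDocuments(client, index, pageSize = 100) {
  const all = [];
  let offset = 0;

  while (true) {
    const page = await client.documents.list({ index, limit: pageSize, offset });
    all.push(...page.documents);

    // Stop when the API reports no more pages; the empty-page check
    // guards against looping forever on an unexpected response.
    if (!page.pagination.hasMore || page.documents.length === 0) break;
    offset += page.documents.length;
  }

  return all;
}

// Usage:
// const docs = await listAllDocuments(client, 'your-index-name', 50);
```

For very large indexes, processing each page as it arrives instead of collecting everything in memory may be preferable.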

Extracting Document Information

When you need to extract specific information from a known document, you can use the document extraction feature. This allows you to query a specific document with natural language and get precisely relevant sections.

Extracting From a Specific Document

JavaScript
// Extract specific information from a document
const result = await client.documents.extract({
  index: 'your-index-name',
  documentId: 'abc123-example-id',
  userQuery: 'What is the hotel address?',
  region: 'EU'       // Optional, defaults to 'EU'
});

// Access the extracted information
console.log('Document summary:', result.documentSummary);
console.log('Relevant sections:', result.relevantSections);

Use Cases

Document extraction is particularly useful when:

  • You already know which document contains the information you need
  • You want to extract specific details from a large document
  • You need to analyze the content of a particular document in depth
  • You're creating document-specific chatbots or knowledge assistants

Performance Tip
Extracting from a specific document is faster and more efficient than querying across your entire index when you know which document contains the information you need.
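When the document is not known up front, one possible pattern is to locate it with a single search and then direct the question at that document with extract. The sketch below assumes that a default search returns a documents array whose entries carry an id, and that the first entry is the best match; findAndExtract is a hypothetical helper.

```javascript
// Hypothetical helper: find a candidate document, then extract from it.
// Assumes found.documents[0] is the most relevant match.
async function findAndExtract(client, index, userQuery) {
  const found = await client.query.search({ index, userQuery, maxResults: 1 });
  const doc = found.documents && found.documents[0];
  if (!doc) return null; // nothing matched the query

  return client.documents.extract({
    index,
    documentId: doc.id,
    userQuery
  });
}

// Usage:
// const info = await findAndExtract(client, 'your-index-name',
//   'What is the hotel address?');
// if (info) console.log(info.relevantSections);
```

The returned id can be kept and reused for follow-up questions about the same document, which avoids repeating the index-wide search.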

OpenAI Function Calling with Spykio

You can enhance your OpenAI-powered applications by using Spykio's document retrieval as a function call. This integration allows your AI to retrieve precisely relevant information from your documents to provide accurate and contextual responses.

Setting Up OpenAI Function Calling

JavaScript
// 1. Set up your clients
const { SpykioClient } = require('spykio-client');
const { OpenAI } = require('openai');

const spykio = new SpykioClient({
  apiKey: 'your-spykio-api-key'
});

const openai = new OpenAI({
  apiKey: 'your-openai-api-key'
});

// 2. Define the Spykio search function - only let OpenAI decide the query
const functions = [
  {
    name: 'searchDocuments',
    description: 'Search through documents to find relevant information',
    parameters: {
      type: 'object',
      properties: {
        query: {
          type: 'string',
          description: 'The search query to find information in the documents'
        }
      },
      required: ['query']
    }
  }
];

// 3. Implement the function to query Spykio - with hardcoded parameters
async function searchDocuments(query) {
  // Index and search parameters are controlled by the application, not OpenAI
  const INDEX = 'your-index-name';
  const ACCURATE_MATCH = true;
  const GET_RELEVANT_INFO = true;
  
  try {
    const results = await spykio.query.search({
      index: INDEX,
      userQuery: query,
      accurateMatch: ACCURATE_MATCH,
      getRelevantInfo: GET_RELEVANT_INFO
    });
    
    // Process results based on response format
    // When accurateMatch and getRelevantInfo are true, we get relevantSections
    // Otherwise, we get documents array
    if (results.relevantSections) {
      return {
        type: 'relevantSections',
        sections: results.relevantSections.map(section => ({
          text: section.text,
          relevanceScore: section.relevanceScore,
          explanation: section.explanation,
          documentId: section.documentId
        })),
        metrics: results.metrics
      };
    } else if (results.documents) {
      return {
        type: 'documents',
        documents: results.documents.map(doc => ({
          id: doc.id,
          summary: doc.summary,
          keywords: doc.keywords,
          created_at: doc.created_at
        })),
        metrics: results.metrics
      };
    }
    
    return results;
  } catch (error) {
    console.error('Error searching documents:', error);
    return { error: 'Failed to search documents' };
  }
}

// 4. Main chat function with function calling
async function chat(userMessage) {
  try {
    // First, send the user's message to OpenAI
    const response = await openai.chat.completions.create({
      model: 'gpt-4o',
      messages: [
        { role: 'system', content: 'You are a helpful assistant. Use the searchDocuments function when you need to find specific information.' },
        { role: 'user', content: userMessage }
      ],
      tools: [
        {
          type: 'function',
          function: functions[0]
        }
      ],
      tool_choice: 'auto'
    });
    
    const responseMessage = response.choices[0].message;
    
    // Check if OpenAI wants to call a function
    if (responseMessage.tool_calls && responseMessage.tool_calls.length > 0) {
      // Answer every tool call: each tool_call_id in the assistant message
      // needs a matching tool message in the follow-up request
      const toolMessages = [];
      for (const toolCall of responseMessage.tool_calls) {
        const functionName = toolCall.function.name;
        const functionArgs = JSON.parse(toolCall.function.arguments);

        // Fall back to an error object so the content is never undefined
        let functionResponse = { error: `Unknown function: ${functionName}` };
        if (functionName === 'searchDocuments') {
          // Call Spykio search with only the query from OpenAI
          functionResponse = await searchDocuments(functionArgs.query);
        }

        toolMessages.push({
          role: 'tool',
          tool_call_id: toolCall.id,
          content: JSON.stringify(functionResponse)
        });
      }

      // Send the function results back to OpenAI
      const secondResponse = await openai.chat.completions.create({
        model: 'gpt-4o',
        messages: [
          { role: 'system', content: 'You are a helpful assistant. When citing information, mention the source.' },
          { role: 'user', content: userMessage },
          responseMessage,
          ...toolMessages
        ]
      });
      
      return secondResponse.choices[0].message.content;
    } else {
      // If no function call was made, return the original response
      return responseMessage.content;
    }
  } catch (error) {
    console.error('Error in chat function:', error);
    return 'Sorry, I encountered an error while processing your request.';
  }
}

// 5. Example usage
async function example() {
  const result = await chat("What information do we have about project timelines?");
  console.log(result);
}

example();

How It Works

This integration works by defining a Spykio search function that OpenAI can call when it needs specific information from your documents. The function definition provides OpenAI with the parameters it needs to make an effective query.

Integration Flow
  1. Define the Spykio search as an available function in your OpenAI request
  2. When OpenAI determines it needs document information, it will call the Spykio function
  3. Your application handles the function call by querying Spykio
  4. Return the relevant document sections to OpenAI
  5. OpenAI incorporates this information into its response

Response Processing
This example handles both response formats from Spykio. When accurateMatch and getRelevantInfo are true, it processes relevantSections. Otherwise, it uses the documents array.
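The model produces the function arguments as a JSON string, and that string is not guaranteed to be valid JSON or to contain the expected field. A small defensive parser, sketched here with the hypothetical name parseToolArguments, lets the application return an error to the model instead of throwing.

```javascript
// Hypothetical helper: validate the arguments of a searchDocuments tool call.
// Returns { ok: true, query } or { ok: false, error }.
function parseToolArguments(toolCall) {
  let args;
  try {
    args = JSON.parse(toolCall.function.arguments);
  } catch (err) {
    return { ok: false, error: 'Arguments were not valid JSON' };
  }

  if (!args || typeof args.query !== 'string' || args.query.trim() === '') {
    return { ok: false, error: 'Missing "query" argument' };
  }
  return { ok: true, query: args.query.trim() };
}

// Usage inside the tool-call handling:
// const parsed = parseToolArguments(toolCall);
// const functionResponse = parsed.ok
//   ? await searchDocuments(parsed.query)
//   : { error: parsed.error };
```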

Claude Tool Calling with Spykio

Anthropic's Claude AI models support tool calling, allowing you to integrate Spykio's document retrieval capabilities directly into your Claude-powered applications.

Setting Up Claude Tool Calling

JavaScript
// 1. Set up your clients
const { SpykioClient } = require('spykio-client');
const Anthropic = require('@anthropic-ai/sdk');

const spykio = new SpykioClient({
  apiKey: 'your-spykio-api-key'
});

const anthropic = new Anthropic({
  apiKey: 'your-anthropic-api-key'
});

// 2. Define the Spykio search tool
const tools = [
  {
    name: 'search_documents',
    description: 'Search through documents to find relevant information',
    input_schema: {
      type: 'object',
      properties: {
        query: {
          type: 'string',
          description: 'The search query to find information in the documents'
        }
      },
      required: ['query']
    }
  }
];

// 3. Implement the function to query Spykio
async function searchDocuments(query) {
  // Index and search parameters are controlled by the application, not Claude
  const INDEX = 'your-index-name';
  const ACCURATE_MATCH = true;
  const GET_RELEVANT_INFO = true;
  
  try {
    const results = await spykio.query.search({
      index: INDEX,
      userQuery: query,
      accurateMatch: ACCURATE_MATCH,
      getRelevantInfo: GET_RELEVANT_INFO
    });
    
    // Process results based on response format
    // When accurateMatch and getRelevantInfo are true, we get relevantSections
    // Otherwise, we get documents array
    if (results.relevantSections) {
      return {
        type: 'relevantSections',
        sections: results.relevantSections.map(section => ({
          text: section.text,
          relevanceScore: section.relevanceScore,
          explanation: section.explanation,
          documentId: section.documentId
        })),
        metrics: results.metrics
      };
    } else if (results.documents) {
      return {
        type: 'documents',
        documents: results.documents.map(doc => ({
          id: doc.id,
          summary: doc.summary,
          keywords: doc.keywords,
          created_at: doc.created_at
        })),
        metrics: results.metrics
      };
    }
    
    return results;
  } catch (error) {
    console.error('Error searching documents:', error);
    return { error: 'Failed to search documents' };
  }
}

// 4. Main chat function with Claude tool calling
async function chat(userMessage) {
  try {
    // First, send the user's message to Claude
    const response = await anthropic.messages.create({
      model: 'claude-3-opus-20240229',
      max_tokens: 1000,
      messages: [
        { role: 'user', content: userMessage }
      ],
      system: 'You are a helpful assistant. Use the search_documents tool when you need to find specific information.',
      tools: tools
    });
    
    // Check if Claude wants to call a tool. The tool_use block is not
    // necessarily first: Claude may emit a text block before it.
    const toolUse = response.content.find(block => block.type === 'tool_use');
    if (toolUse) {
      const toolName = toolUse.name;
      const input = toolUse.input;
      
      let toolResponse;
      if (toolName === 'search_documents') {
        // Call Spykio search with Claude's query
        toolResponse = await searchDocuments(input.query);
      }
      
      // Send the tool result back to Claude
      const secondResponse = await anthropic.messages.create({
        model: 'claude-3-opus-20240229',
        max_tokens: 1000,
        messages: [
          { role: 'user', content: userMessage },
          { role: 'assistant', content: [toolUse] },
          { 
            role: 'user', 
            content: [
              {
                type: 'tool_result',
                tool_use_id: toolUse.id,
                content: JSON.stringify(toolResponse)
              }
            ]
          }
        ],
        system: 'You are a helpful assistant. When citing information, mention the source.',
        tools: tools // keep the tool definitions available on the follow-up turn
      });
      
      // Return Claude's final response
      return secondResponse.content.map(c => 
        c.type === 'text' ? c.text : ''
      ).join('');
    } else {
      // If no tool call was made, return the original response
      return response.content[0].text;
    }
  } catch (error) {
    console.error('Error in chat function:', error);
    return 'Sorry, I encountered an error while processing your request.';
  }
}

// 5. Example usage
async function example() {
  const result = await chat("What information do we have about project timelines?");
  console.log(result);
}

example();

Key Differences from OpenAI

While the concept is similar to OpenAI's function calling, Claude's implementation has a few differences in how tools are defined and results are processed:

  • Tools use input_schema instead of parameters
  • Tool responses are handled with tool_use and tool_result types
  • Claude's content field is an array of content blocks with different types

Model Support
Tool calling requires Claude 3 Opus, Claude 3 Sonnet, or newer models. This example uses Claude 3 Opus.
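The differences listed above can be captured in a few small helpers. This is a sketch, not part of either SDK: toClaudeTool, findToolUse, and collectText are hypothetical names, and they operate on plain objects shaped like the definitions and responses shown in the examples.

```javascript
// Convert an OpenAI-style function definition into Claude's tool format:
// the JSON schema moves from `parameters` to `input_schema`.
function toClaudeTool(fn) {
  return {
    name: fn.name,
    description: fn.description,
    input_schema: fn.parameters
  };
}

// Return the first tool_use block in a Claude content array, or null.
function findToolUse(content) {
  return content.find((block) => block.type === 'tool_use') || null;
}

// Join the text blocks of a Claude content array into one string.
function collectText(content) {
  return content
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('');
}
```

Searching the whole content array matters because a response can contain a text block before the tool_use block.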

OpenRouter Tool Calling with Spykio

OpenRouter provides a unified API for accessing a wide range of AI models from different providers. You can integrate Spykio's document retrieval capabilities with any OpenRouter model that supports tool calling.

Setting Up OpenRouter Tool Calling

JavaScript
// 1. Set up your clients
const { SpykioClient } = require('spykio-client');
const OpenAI = require('openai');

const spykio = new SpykioClient({
  apiKey: 'your-spykio-api-key'
});

// Initialize OpenRouter with OpenAI-compatible SDK
const openRouter = new OpenAI({
  apiKey: 'your-openrouter-api-key',
  baseURL: 'https://openrouter.ai/api/v1'
});

// 2. Define the Spykio search tool
const tools = [
  {
    type: 'function',
    function: {
      name: 'searchDocuments',
      description: 'Search through documents to find relevant information',
      parameters: {
        type: 'object',
        properties: {
          query: {
            type: 'string',
            description: 'The search query to find information in the documents'
          }
        },
        required: ['query']
      }
    }
  }
];

// 3. Implement the function to query Spykio
async function searchDocuments(query) {
  // Index and search parameters are controlled by the application, not the AI
  const INDEX = 'your-index-name';
  const ACCURATE_MATCH = true;
  const GET_RELEVANT_INFO = true;
  
  try {
    const results = await spykio.query.search({
      index: INDEX,
      userQuery: query,
      accurateMatch: ACCURATE_MATCH,
      getRelevantInfo: GET_RELEVANT_INFO
    });
    
    // Process results based on response format
    if (results.relevantSections) {
      return {
        type: 'relevantSections',
        sections: results.relevantSections.map(section => ({
          text: section.text,
          relevanceScore: section.relevanceScore,
          explanation: section.explanation,
          documentId: section.documentId
        })),
        metrics: results.metrics
      };
    } else if (results.documents) {
      return {
        type: 'documents',
        documents: results.documents.map(doc => ({
          id: doc.id,
          summary: doc.summary,
          keywords: doc.keywords,
          created_at: doc.created_at
        })),
        metrics: results.metrics
      };
    }
    
    return results;
  } catch (error) {
    console.error('Error searching documents:', error);
    return { error: 'Failed to search documents' };
  }
}

// 4. Main chat function with OpenRouter tool calling
async function chat(userMessage, model = 'google/gemini-2.0-flash-001') {
  try {
    // First, send the user's message to the AI
    const messages = [
      { role: 'system', content: 'You are a helpful assistant. Use the searchDocuments tool when you need to find specific information.' },
      { role: 'user', content: userMessage }
    ];
    
    let response = await openRouter.chat.completions.create({
      model: model, // You can use any model OpenRouter supports with tool calling
      messages: messages,
      tools: tools
    });
    
    let responseMessage = response.choices[0].message;
    messages.push(responseMessage);
    
    // Check if the model wants to call a tool
    if (responseMessage.tool_calls) {
      // Process each tool call (usually just one)
      for (const toolCall of responseMessage.tool_calls) {
        const functionName = toolCall.function.name;
        const functionArgs = JSON.parse(toolCall.function.arguments);
        
        let functionResponse;
        if (functionName === 'searchDocuments') {
          // Call Spykio search with the query from the AI
          functionResponse = await searchDocuments(functionArgs.query);
        }
        
        // Append the tool result to messages
        messages.push({
          role: 'tool',
          tool_call_id: toolCall.id,
          name: functionName,
          content: JSON.stringify(functionResponse)
        });
      }
      
      // Send the tool results back to the AI for final response
      const finalResponse = await openRouter.chat.completions.create({
        model: model,
        messages: messages,
        tools: tools
      });
      
      return finalResponse.choices[0].message.content;
    } else {
      // If no tool call was made, return the original response
      return responseMessage.content;
    }
  } catch (error) {
    console.error('Error in chat function:', error);
    return 'Sorry, I encountered an error while processing your request.';
  }
}

// 5. Example usage
async function example() {
  // Choose from any OpenRouter model that supports tool calling
  // Examples: google/gemini-2.0-flash-001, anthropic/claude-3-opus, openai/gpt-4o
  const result = await chat("What information do we have about project timelines?", "google/gemini-2.0-flash-001");
  console.log(result);
}

example();

Model Flexibility

One of the main advantages of using OpenRouter is the ability to switch between different LLM providers while maintaining a consistent API. You can experiment with various models to find the best one for your use case.

Supported Models
  • google/gemini-2.0-flash-001 - Fast and cost-effective tool calling
  • anthropic/claude-3.7-sonnet - Highest accuracy and reasoning
  • openai/gpt-4o - Well-balanced performance
  • Many other models as OpenRouter adds support

OpenRouter API Key
You'll need to create an account on OpenRouter.ai to get an API key. The example uses the OpenAI-compatible client library but points to the OpenRouter API endpoint.
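Since the model is just a string parameter, trying several models in order is straightforward. The sketch below uses a hypothetical helper, chatWithFallback, which takes the chat function as an argument. Note that the chat function in the example above catches its own errors and returns a message, so a fallback like this only works with a variant that lets errors propagate.

```javascript
// Hypothetical helper: try each model in order and return the first reply.
// chatFn(userMessage, model) must throw (or reject) on failure.
async function chatWithFallback(userMessage, models, chatFn) {
  let lastError;
  for (const model of models) {
    try {
      const reply = await chatFn(userMessage, model);
      return { model, reply };
    } catch (err) {
      lastError = err; // remember the failure and try the next model
    }
  }
  throw lastError || new Error('No models were provided');
}

// Usage:
// const { model, reply } = await chatWithFallback(
//   'What information do we have about project timelines?',
//   ['google/gemini-2.0-flash-001', 'openai/gpt-4o'],
//   chatThatThrows // a version of chat() without the outer try/catch
// );
```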