NEXUS Alpha generates long-only investment signals (BUY, OVERWEIGHT, NEUTRAL) by applying FinBERT-based sentiment analysis to SEC EDGAR filings — 10-K annual reports, 10-Q quarterlies, and 8-K earnings releases — supplemented by large language model enrichment. Signals are produced within 10 minutes of a new filing appearing on EDGAR.
Raw HTML from EDGAR is parsed with beautifulsoup4. For 10-K/10-Q filings, the MD&A section is isolated by regex matching on Item 7 headers. For 8-K filings, Item 2.02 and Item 7.01 are extracted. Input to the NLP stage is capped at 15,000 characters; because FinBERT accepts at most 512 tokens per pass, the capped text is chunked before scoring.
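The section isolation and length cap described above can be sketched as follows. This is a minimal illustration, not the production parser: it assumes the HTML has already been reduced to plain text, and the two regex patterns and the function name `isolate_mdna` are hypothetical. Real filings label their items inconsistently, so a production pattern set would need to be broader.

```python
import re

MAX_CHARS = 15_000  # input cap stated in the text

# Hypothetical patterns; the header match consumes the rest of its line.
ITEM_7_START = re.compile(r"item\s+7\.?\s+management.s\s+discussion[^\n]*", re.IGNORECASE)
ITEM_7_END = re.compile(r"item\s+(7a|8)\.", re.IGNORECASE)


def isolate_mdna(text: str, max_chars: int = MAX_CHARS) -> str:
    """Return the MD&A body from plain filing text, truncated to max_chars."""
    starts = list(ITEM_7_START.finditer(text))
    if not starts:
        return ""
    # Assumption: the table of contents mentions Item 7 first, so the last
    # header match is taken as the start of the actual section.
    start = starts[-1].end()
    end_match = ITEM_7_END.search(text, start)
    end = end_match.start() if end_match else len(text)
    return text[start:end].strip()[:max_chars]
```

Taking the last header match is one simple way to skip the table of contents; it would fail on filings that reference Item 7 again after the section itself.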
The sentiment model is ProsusAI/finbert, a BERT model fine-tuned for financial sentiment on the Financial PhraseBank (roughly 4,800 labelled sentences from financial news). It outputs three class probabilities: Positive, Negative, and Neutral. Net sentiment is computed as Positive − Negative and ranges from −1.0 to +1.0.
Long text is split into 512-token chunks, scored independently, and averaged by token count. This preserves sentiment signal from both the beginning and the end of the MD&A, which contain different types of information (historical vs forward-looking).
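As a rough sketch of the chunk-and-average step, the code below splits a token sequence into fixed-size chunks and combines per-chunk FinBERT probabilities into one net sentiment, weighted by token count. The `ChunkScore` type and both function names are hypothetical, and the probabilities are assumed to come from a separate inference call that is not shown. Note that a real BERT input also reserves two positions for special tokens, which this sketch ignores.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ChunkScore:
    """FinBERT class probabilities for one chunk (hypothetical container)."""
    n_tokens: int
    positive: float
    negative: float
    neutral: float


def split_into_chunks(tokens: Sequence, size: int = 512) -> List[Sequence]:
    """Split a token sequence into consecutive chunks of at most `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def net_sentiment(chunks: Sequence[ChunkScore]) -> float:
    """Token-weighted average of (positive - negative) across chunks."""
    total = sum(c.n_tokens for c in chunks)
    if total == 0:
        return 0.0
    weighted = sum(c.n_tokens * (c.positive - c.negative) for c in chunks)
    return weighted / total
```

Weighting by token count keeps a short final chunk from counting as much as a full 512-token chunk.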
Optional second-stage enrichment generates a structured investment thesis. The prompt asks for: signal direction, confidence, bull case, bear case, key risks, revenue tone (BEAT/MISS/IN-LINE for 8-K), and guidance tone (RAISED/LOWERED/MAINTAINED).
Production uses qwen2.5:14b via Ollama (local, zero marginal cost). The enrichment model's override is suppressed when LLM confidence is below the 0.60 threshold.
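One way the 0.60 suppression rule might look in code is sketched below. It assumes the enrichment model returns a JSON object with `signal` and `confidence` fields; those field names, the function name, and the fallback behavior on malformed output are illustrative assumptions, not the documented schema.

```python
import json

LLM_CONFIDENCE_FLOOR = 0.60  # threshold stated in the text
ALLOWED_SIGNALS = {"BUY", "OVERWEIGHT", "NEUTRAL"}


def apply_llm_override(finbert_signal: str, llm_json: str) -> str:
    """Return the LLM's signal only if it is valid and confident enough.

    Assumption: malformed or incomplete LLM output falls back to the
    FinBERT-derived signal rather than raising.
    """
    try:
        thesis = json.loads(llm_json)
        signal = thesis["signal"]          # hypothetical field name
        confidence = float(thesis["confidence"])  # hypothetical field name
    except (ValueError, KeyError, TypeError):
        return finbert_signal
    if not isinstance(signal, str) or signal not in ALLOWED_SIGNALS:
        return finbert_signal
    if confidence < LLM_CONFIDENCE_FLOOR:
        return finbert_signal  # override suppressed below the threshold
    return signal
```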
| Net Sentiment | Trend Adjustment | Raw Signal |
|---|---|---|
| > 0.30 | - | BUY |
| 0.10 – 0.30 | - | OVERWEIGHT |
| −0.10 – 0.10 | - | NEUTRAL |
| < −0.10 | - | NEUTRAL (long-only mode) |
Trend adjustment: The current net sentiment is compared against the rolling average of the prior 4 filings for the same ticker. A positive trend (improving management tone QoQ) upgrades NEUTRAL → OVERWEIGHT. Negative trend is suppressed in long-only mode.
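The threshold table and trend adjustment can be combined into a single mapping, sketched below. Two points are assumptions rather than documented behavior: how the band boundaries at exactly 0.10 and −0.10 are assigned, and that the trend upgrade applies only to the −0.10 to 0.10 band and not to readings below −0.10. The function name is hypothetical.

```python
from statistics import mean
from typing import Sequence


def raw_signal(net: float, prior_nets: Sequence[float]) -> str:
    """Map net sentiment to a long-only signal.

    prior_nets holds net sentiment of earlier filings for the same ticker,
    oldest first; only the last four are used for the trend comparison.
    """
    if net > 0.30:
        return "BUY"
    if net >= 0.10:  # assumption: 0.10 itself counts as OVERWEIGHT
        return "OVERWEIGHT"
    if net >= -0.10:
        recent = list(prior_nets)[-4:]
        # Positive trend: current reading above the rolling average.
        if recent and net > mean(recent):
            return "OVERWEIGHT"
        return "NEUTRAL"
    # Negative readings stay NEUTRAL in long-only mode.
    return "NEUTRAL"
```

For example, `raw_signal(0.05, [0.0, -0.05, 0.02, 0.01])` returns `"OVERWEIGHT"` because 0.05 exceeds the prior four-filing average.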
Confidence cap at 0.75: Information coefficient (IC) analysis shows that, for stated confidence above 0.75, confidence correlates negatively with realized 60-day return. Reported confidence is therefore capped at 0.75 so that less reliable high-sentiment readings are not overstated.
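In code, the cap amounts to a single clamp. The sketch below assumes the cap is applied to the final reported confidence; the function name is hypothetical.

```python
CONFIDENCE_CAP = 0.75  # cap stated in the text


def cap_confidence(confidence: float) -> float:
    """Clamp reported confidence to the 0.75 ceiling."""
    return min(confidence, CONFIDENCE_CAP)
```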
| Hold Period | Avg Return | vs SPY | Win Rate | Sharpe | Max DD |
|---|---|---|---|---|---|
| 20 days | +0.61% | −0.11% | 58.8% | 0.35 | −27.5% |
| 40 days | +3.63% | +1.34% | 58.8% | 1.13 | −15.4% |
| 60 days | +4.79% | +2.45% | 70.6% | 1.20 | −20.0% |
Key finding: The signal edge concentrates at longer holding periods. At 20 days the average return trails SPY by 0.11%; at 60 days it leads SPY by 2.45% with the highest Sharpe and win rate. The optimal holding period is therefore 60 days, which aligns with the quarterly reporting cycle.
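For readers who want to reproduce statistics of this kind on their own return series, the sketch below shows common definitions of win rate, Sharpe ratio, and maximum drawdown. The source does not state how its figures were computed, so these formulas are assumptions: a zero risk-free rate, the sample standard deviation, a caller-supplied annualization factor, and drawdown measured on sequentially compounded returns.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def win_rate(returns: Sequence[float]) -> float:
    """Fraction of trades with a strictly positive return."""
    return sum(r > 0 for r in returns) / len(returns)


def sharpe(returns: Sequence[float], periods_per_year: float) -> float:
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    sd = stdev(returns)  # requires at least two observations
    if sd == 0:
        return 0.0
    return mean(returns) / sd * math.sqrt(periods_per_year)


def max_drawdown(returns: Sequence[float]) -> float:
    """Largest peak-to-trough decline of the compounded equity curve (<= 0)."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst
```

With a 60-day hold, `periods_per_year` would be roughly 252 / 60 if trading days are meant, which is again an assumption about the backtest convention.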
| Confidence Band | Signal Count | Avg Return | Win Rate |
|---|---|---|---|
| 0.65 – 0.70 | 10 | +12.3% | 90% |
| 0.70 – 0.80 | 20 | +1.6% | 60% |
| > 0.80 (pre-cap historical) | 4 | +2.1% | 75% |
Note: The confidence cap at 0.75 was applied after this backtest. The >0.80 bucket reflects historical signals generated before the cap; no live signals will exceed 0.75.
| Signal | Count | Avg Return | Win Rate |
|---|---|---|---|
| BUY | 4 | +2.1% | 75% |
| OVERWEIGHT | 30 | +5.1% | 70% |
OVERWEIGHT signals — generated at moderate positive sentiment (net 0.10–0.30) with positive trend — comprise the majority of actionable signals and carry the strongest risk-adjusted return.
The backtest evaluated long-short mode (including SELL and UNDERWEIGHT signals). Long-short Sharpe was −0.22 vs long-only Sharpe of +1.20. SELL signals on large-cap US equities systematically underperform due to:
SELL and UNDERWEIGHT signals are generated internally but not exposed via the API in long-only mode (default for all tiers). Long-short mode is available on the Institutional tier upon request with a separate risk disclosure agreement.
Signals are delivered via three channels simultaneously upon generation:
REST: GET /signals/top, /signals/history, /signals/watchlist
WebSocket: wss://api.nexusalpha.io/ws/signals?api_key=... (pushed in real time)
Target: < 10 minutes from EDGAR filing acceptance to signal delivery. This is measured and published at /status. EDGAR updates RSS feeds every 10 minutes. Our watcher polls every 5 minutes, triggering the full NLP pipeline on first match.
| Layer | Source | Immutability |
|---|---|---|
| Raw filings | SEC EDGAR HTTPS | Cached on ingestion; hash-verified |
| NLP scores | FinBERT inference | Written once, never overwritten |
| Signals | Signal engine | Unique constraint on (ticker, signal_date); upserts only |
| Paper track | Live pipeline | Append-only; generated_at is immutable; no backdating |
| API key usage | Every API call | Append-only audit log in api_usage table |
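The uniqueness rule on the Signals layer can be illustrated with SQLite. This is only a sketch: the source does not name the database engine, and the table layout, column names other than `ticker` and `signal_date`, and function names here are hypothetical. The `ON CONFLICT ... DO UPDATE` form requires SQLite 3.24 or later.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create a signals table with one row per (ticker, signal_date)."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS signals (
            ticker      TEXT NOT NULL,
            signal_date TEXT NOT NULL,
            signal      TEXT NOT NULL,
            confidence  REAL NOT NULL,
            UNIQUE (ticker, signal_date)
        )
        """
    )
    return conn


def upsert_signal(conn, ticker, signal_date, signal, confidence):
    """Insert a signal, or update the existing row for the same key."""
    conn.execute(
        """
        INSERT INTO signals (ticker, signal_date, signal, confidence)
        VALUES (?, ?, ?, ?)
        ON CONFLICT (ticker, signal_date)
        DO UPDATE SET signal = excluded.signal,
                      confidence = excluded.confidence
        """,
        (ticker, signal_date, signal, confidence),
    )
    conn.commit()
```

Writing the same ticker and date twice leaves a single row holding the most recent values, so duplicate pipeline runs cannot create duplicate signals.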
| Endpoint | Tier | Description |
|---|---|---|
| GET /signals/top | 1+ | Highest-confidence BUY/OVERWEIGHT signals |
| GET /signals/history | 1+ | Full filterable signal time series |
| GET /signals/watchlist | 1+ | NEUTRAL signals (monitoring list) |
| GET /analytics/performance | 1+ | Backtest stats + confidence bucket breakdown |
| GET /filings/{ticker} | 2+ | Raw sentiment history per ticker |
| GET /data/export | 2+ | Bulk CSV download of all signals |
| GET /universe | 3 | Full universe snapshot: latest signal per ticker |
| GET /paper-track | 1+ | Live paper track record with SPY comparison |
| WS /ws/signals | 1+ | Real-time signal push via WebSocket |
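A minimal client sketch for the endpoints above follows. Several details are assumptions and should be checked against the actual API reference: the REST base URL is inferred from the WebSocket host, the `api_key` query parameter is assumed to work for REST as it does for the WebSocket URL, responses are assumed to be JSON, and any extra query parameters passed through `**params` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.nexusalpha.io"  # assumption: same host as the wss:// URL
WS_URL = "wss://api.nexusalpha.io/ws/signals"


def rest_url(path: str, api_key: str, **params) -> str:
    """Build a REST URL; api_key as a query parameter is an assumption."""
    query = urlencode({"api_key": api_key, **params})
    return f"{BASE_URL}{path}?{query}"


def ws_url(api_key: str) -> str:
    """Build the WebSocket URL in the form shown in the delivery section."""
    return f"{WS_URL}?{urlencode({'api_key': api_key})}"


def fetch_json(path: str, api_key: str, **params):
    """GET an endpoint and decode the body, assuming a JSON response."""
    with urlopen(rest_url(path, api_key, **params), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A call such as `fetch_json("/signals/top", "YOUR_KEY")` would then return the decoded payload, whose shape is not specified here.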
NEXUS Alpha signals are generated by automated NLP models and are provided for informational and research purposes only. They do not constitute investment advice, a recommendation to buy or sell any security, or an offer of any investment product.
NEXUS Alpha is not a registered investment adviser under the Investment Advisers Act of 1940, nor is it registered with any equivalent regulatory body in any other jurisdiction. Past signal performance does not guarantee future results.
Users of NEXUS Alpha signals are solely responsible for their own investment decisions. NEXUS Alpha and its operators accept no liability for any losses arising from the use of these signals. By accessing the API, you confirm you have read and agree to the Terms of Service.