Connectivity Admin: Per-Instance Diagnostics (Superadmin)

17.AA Connectivity Admin: Per-Instance Diagnostics (Superadmin)

Expands the /v1/admin/connectivity/* surface with two new routes that match the "Connectivity" console in the frontend. The original endpoints (GET /v1/admin/connectivity/metrics and GET /v1/admin/connectivity/qr-diagnostics) remain valid for existing consumers and are documented in section 17.

  • Auth: superadmin JWT on every endpoint.
  • Standard error envelope (error_code, message, trace_id).
  • Accepted window shapes: 15m, 1h, 6h, 24h, 7d. Any value outside this set falls back to the endpoint's default.
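The window fallback rule can be illustrated with a minimal sketch. The helper name resolve_window and its signature are hypothetical, and exact string matching is an assumption; the source only states that values outside the accepted set fall back to the endpoint's default.

```python
# Accepted window shapes and their length in seconds.
WINDOW_SECONDS = {
    "15m": 15 * 60,
    "1h": 60 * 60,
    "6h": 6 * 60 * 60,
    "24h": 24 * 60 * 60,
    "7d": 7 * 24 * 60 * 60,
}


def resolve_window(raw, endpoint_default):
    """Return (window, window_seconds) for a raw query value.

    Hypothetical helper: anything that is not an exact member of the
    accepted set (including a missing value) uses the endpoint default.
    """
    window = raw if raw in WINDOW_SECONDS else endpoint_default
    return window, WINDOW_SECONDS[window]


print(resolve_window("6h", "24h"))   # accepted value is kept
print(resolve_window("30m", "24h"))  # unknown value falls back to the default
```

The two defaults in this section are 24h for the instances rollup and 1h for the timeline, which is why the default is passed in per endpoint rather than hard-coded.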

GET /v1/admin/connectivity/instances

Per-instance rollup with aggregated counts of connection events over a configurable window. The connectivity console uses it to render the main grid with filters and sorting.

Query params:

| Param | Type | Default | Meaning |
| --- | --- | --- | --- |
| window | one of 15m, 1h, 6h, 24h, 7d | 24h | Aggregation window for the counters. |
| status | CSV of statuses (CONNECTED,DISCONNECTED,...) | | Restricts instances[] to the given statuses. Does not affect aggregate. |
| company_id | uint | | Restricts results to a single company. |
| q | string | | Case-insensitive partial match on name, phone, external_id. |
| limit | int | 50 (max 200) | Pagination. |
| offset | int | 0 | Pagination. |
| sort | one of activity_desc, activity_asc, name_asc, flap_desc | activity_desc | Sort order, applied after filters. |
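As a rough client-side illustration of these parameters, the sketch below builds a request URL. The host in BASE_URL and the function name build_instances_url are hypothetical, and clamping limit on the client is an assumption (the source only gives the maximum of 200, not how the server treats larger values).

```python
from urllib.parse import urlencode

BASE_URL = "https://api.example.com"  # hypothetical host, not from the source


def build_instances_url(window="24h", status=None, company_id=None, q=None,
                        limit=50, offset=0, sort="activity_desc"):
    """Build the URL for GET /v1/admin/connectivity/instances."""
    params = {
        "window": window,
        "limit": min(limit, 200),  # documented maximum; client-side clamp is an assumption
        "offset": offset,
        "sort": sort,
    }
    if status:
        params["status"] = ",".join(status)  # CSV of statuses
    if company_id is not None:
        params["company_id"] = company_id
    if q:
        params["q"] = q
    return f"{BASE_URL}/v1/admin/connectivity/instances?{urlencode(params)}"


print(build_instances_url(status=["CONNECTED", "DISCONNECTED"], limit=500))
```

The request still needs the superadmin JWT; how it is attached (header name and scheme) is not specified in this section, so it is left out of the sketch.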

counts_window counters (per instance):

| Field | Source |
| --- | --- |
| connect_start | connection.diagnostic with step=connect_start |
| connect_cooldown | connection.diagnostic with step=connect_cooldown |
| connection_connected | connection.update with status=connected |
| connection_disconnected | connection.update with status=disconnected |
| connection_replaced | connection.update with status=replaced (total, including self-caused) |
| stream_replaced_self | connection.diagnostic with step=stream_replaced_self |
| flap_detected | connection.diagnostic with step=flap_detected |
| websocket_closed | connection.diagnostic with step=websocket_closed |
| keepalive_restored | connection.{update,diagnostic} |
| keepalive_timeout | connection.{update,diagnostic} |
| logged_out | connection.update with status=logged_out |
| banned | instance.banned OR connection.update with status=banned or permanently_banned |
| qr_emitted | connection.diagnostic with step=qr_emitted |
| messages_sent | messages with direction=outbound AND created_at >= window_start |
| messages_received | messages with direction=inbound AND created_at >= window_start |
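The event-to-counter mapping can be read as a small classifier. The sketch below is an illustration, not the server's implementation: the function names are hypothetical, the event shape (a type plus a data object carrying step or status) is borrowed from the timeline response in this section, and the two keepalive counters are left out because the table does not say which field they match on. The message counters come from the messages table, not from events, so they are out of scope too.

```python
from collections import Counter

# connection.diagnostic steps that map one-to-one onto a counter name.
DIAGNOSTIC_STEPS = {
    "connect_start", "connect_cooldown", "stream_replaced_self",
    "flap_detected", "websocket_closed", "qr_emitted",
}

# connection.update statuses and the counter each one feeds.
UPDATE_STATUS_TO_COUNTER = {
    "connected": "connection_connected",
    "disconnected": "connection_disconnected",
    "replaced": "connection_replaced",
    "logged_out": "logged_out",
    "banned": "banned",
    "permanently_banned": "banned",
}


def counter_for(event_type, data):
    """Return the counts_window key an event feeds, or None (hypothetical helper)."""
    if event_type == "connection.diagnostic":
        step = data.get("step")
        return step if step in DIAGNOSTIC_STEPS else None
    if event_type == "connection.update":
        return UPDATE_STATUS_TO_COUNTER.get(data.get("status"))
    if event_type == "instance.banned":
        return "banned"
    return None


def tally(events):
    """Count events per counter key; no deduplication is attempted."""
    counts = Counter()
    for event in events:
        key = counter_for(event["type"], event.get("data", {}))
        if key is not None:
            counts[key] += 1
    return counts


sample = [
    {"type": "connection.diagnostic", "data": {"step": "connect_start"}},
    {"type": "connection.update", "data": {"status": "connected"}},
    {"type": "instance.banned", "data": {}},
]
print(dict(tally(sample)))
```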

Aggregate (fleet-wide):

| Field | Meaning |
| --- | --- |
| total_instances | Total after filters (before pagination). |
| by_status | Map whose keys CONNECTED, DISCONNECTED, QR_PENDING, CONNECTING, BANNED, TIMED_OUT, CREATED are always present (0 when empty). |
| connectivity_rate | connected / total_instances * 100 (rounded to 2 decimal places). |
| flap_detected_in_window | Sum of flap_detected across the whole fleet. |
| stream_replaced_self_in_window | Sum of stream_replaced_self. |
| stream_replaced_genuine_in_window | Count of connection.update with status=replaced minus the self-caused count. Never negative. |
| ban_in_window | Bans observed in the window (via instance.banned + connection.update status=banned). |
| logged_out_in_window | Logouts observed in the window. |
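Two of the aggregate fields are derived arithmetically, which a short sketch makes concrete. The function names are hypothetical, and returning 0.0 for an empty fleet is an assumption; the source does not say what the rate is when total_instances is 0.

```python
def connectivity_rate(by_status, total_instances):
    """connected / total_instances * 100, rounded to 2 decimal places."""
    if total_instances == 0:
        return 0.0  # assumption: the source does not define the empty-fleet case
    return round(by_status.get("CONNECTED", 0) / total_instances * 100, 2)


def stream_replaced_genuine(replaced_total, replaced_self):
    """Replaced events not caused by the instance itself, floored at zero."""
    return max(0, replaced_total - replaced_self)


# Figures taken from the abbreviated response example in this section.
print(connectivity_rate({"CONNECTED": 38}, 42))
print(stream_replaced_genuine(0, 0))
```

With 38 connected instances out of 42 the rate is 90.48, which matches the connectivity_rate value in the example response.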

200 response (abbreviated example):

json
{
  "window": "24h",
  "window_seconds": 86400,
  "total": 42,
  "returned": 10,
  "aggregate": {
    "total_instances": 42,
    "by_status": {"CONNECTED":38,"DISCONNECTED":2,"QR_PENDING":0,"CONNECTING":1,"BANNED":0,"TIMED_OUT":0,"CREATED":1},
    "connectivity_rate": 90.48,
    "flap_detected_in_window": 1,
    "stream_replaced_self_in_window": 0,
    "stream_replaced_genuine_in_window": 0,
    "ban_in_window": 0,
    "logged_out_in_window": 0
  },
  "instances": [
    {
      "id": "f35157b4-46ae-4bdb-8f51-4870bcb5bf9c",
      "company_id": 9,
      "company_name": "Catcher",
      "company_slug": "catcher-tenant-1",
      "name": "Catcher",
      "phone": "554137984905",
      "status": "CONNECTED",
      "desired_state": "CONNECTED",
      "tier": "healthy",
      "connected_at": "2026-04-23T13:22:22Z",
      "last_seen": "2026-04-23T13:22:22Z",
      "last_activity_at": "2026-04-23T13:27:11Z",
      "disconnected_at": null,
      "connected_duration_seconds": 1248,
      "proxy": {
        "id": 11,
        "ip_address": "200.239.204.138",
        "city": "br-saopaulo",
        "zone": "isp_br",
        "last_health_ok": true,
        "last_health_check": "2026-04-23T13:25:00Z"
      },
      "last_error": "",
      "ban_expiry": null,
      "counts_window": {
        "connect_start": 2,
        "connect_cooldown": 0,
        "connection_connected": 2,
        "connection_disconnected": 1,
        "connection_replaced": 0,
        "stream_replaced_self": 0,
        "flap_detected": 0,
        "websocket_closed": 1,
        "keepalive_restored": 0,
        "keepalive_timeout": 0,
        "logged_out": 0,
        "banned": 0,
        "qr_emitted": 0,
        "messages_sent": 127,
        "messages_received": 84
      },
      "last_event": {
        "type": "connection.update",
        "status_or_step": "connected",
        "at": "2026-04-23T13:22:22Z"
      }
    }
  ]
}

The tier field uses the same health.EvaluateTier function as the /v1/admin/instance-health route (healthy, stale, degraded, offline, critical_offline, intentional, pending, banned).

GET /v1/admin/connectivity/instances/{id}/timeline

Chronological timeline (DESC) of connection events for a single instance. Used by the diagnostics drawer and by the incident reporter in the connectivity console.

Query params:

| Param | Type | Default | Meaning |
| --- | --- | --- | --- |
| window | one of 15m, 1h, 6h, 24h, 7d | 1h | Time range of the events. |
| types | CSV of event types | connection.diagnostic,connection.update,instance.offline,instance.recovered,instance.banned,instance.critical_offline | Restricts the feed. |
| limit | int | 200 (max 1000) | Pagination, hard-capped. |

200 response:

json
{
  "instance": {
    "id": "f35157b4-46ae-4bdb-8f51-4870bcb5bf9c",
    "company_id": 9,
    "company_name": "Catcher",
    "name": "Catcher",
    "phone": "554137984905",
    "status": "CONNECTED",
    "desired_state": "CONNECTED",
    "tier": "healthy",
    "connected_at": "2026-04-23T13:22:22Z",
    "last_seen": "2026-04-23T13:22:22Z",
    "last_activity_at": "2026-04-23T13:27:11Z",
    "disconnected_at": null
  },
  "window": "1h",
  "window_seconds": 3600,
  "total": 12,
  "events": [
    {
      "event_id": "e5b4c57a-...",
      "type": "connection.diagnostic",
      "timestamp": "2026-04-23T13:22:18.526Z",
      "summary": "connect_start",
      "level": "info",
      "step": "connect_start",
      "message": "Iniciando conexao",
      "data": {"step":"connect_start","level":"info","message":"Iniciando conexao","timestamp":1713879738526}
    },
    {
      "event_id": "4fda3eec-...",
      "type": "connection.update",
      "timestamp": "2026-04-23T13:22:22.178Z",
      "summary": "connected",
      "level": "success",
      "data": {"status":"connected"}
    }
  ]
}

Derivation of summary/level:

  • connection.diagnostic: summary=step; level comes from the row itself.
  • connection.update: summary=status; level is inferred (connected -> success; disconnected/replaced/banned/keepalive_timeout -> warning; logged_out/connect_failure/stream_error -> error).
  • instance.offline -> warning, instance.recovered -> success, instance.banned / instance.critical_offline -> error.
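The derivation rules can be sketched as a lookup. The function name derive_summary_level is hypothetical. Two gaps in the source are handled explicitly rather than guessed: the summary for instance.* events is not defined, and neither is the level for a connection.update status outside the listed ones, so both come back as None.

```python
# Levels inferred for connection.update, keyed by status.
UPDATE_LEVELS = {
    "connected": "success",
    "disconnected": "warning",
    "replaced": "warning",
    "banned": "warning",
    "keepalive_timeout": "warning",
    "logged_out": "error",
    "connect_failure": "error",
    "stream_error": "error",
}

# Fixed levels for instance.* events, keyed by event type.
INSTANCE_LEVELS = {
    "instance.offline": "warning",
    "instance.recovered": "success",
    "instance.banned": "error",
    "instance.critical_offline": "error",
}


def derive_summary_level(event_type, data):
    """Return (summary, level) for a timeline event (hypothetical helper)."""
    if event_type == "connection.diagnostic":
        # summary is the step; level is stored on the row itself.
        return data.get("step"), data.get("level")
    if event_type == "connection.update":
        status = data.get("status")
        # Statuses not listed in the rules yield None (not specified in the source).
        return status, UPDATE_LEVELS.get(status)
    # The source gives no summary rule for instance.* events.
    return None, INSTANCE_LEVELS.get(event_type)


print(derive_summary_level("connection.update", {"status": "connected"}))
print(derive_summary_level("connection.diagnostic", {"step": "connect_start", "level": "info"}))
```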

Returns 404 if the instanceId does not exist in any active tenant DB.
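A client might separate the success path from the error envelope along these lines. This is a sketch under assumptions: the class and function names are hypothetical, the HTTP call itself is omitted, and the envelope values in the usage lines are made up for illustration (the source names the envelope fields error_code, message and trace_id but gives no concrete values).

```python
import json


class ConnectivityApiError(Exception):
    """Carries the standard error envelope fields (hypothetical class)."""

    def __init__(self, status, error_code, message, trace_id):
        super().__init__(f"{status} {error_code}: {message} (trace_id={trace_id})")
        self.status = status
        self.error_code = error_code
        self.trace_id = trace_id


def parse_timeline_response(status, body):
    """Return the events list on 200, otherwise raise with the envelope fields."""
    payload = json.loads(body)
    if status == 200:
        return payload["events"]
    raise ConnectivityApiError(
        status,
        payload.get("error_code"),
        payload.get("message"),
        payload.get("trace_id"),
    )


# Illustrative envelope only; these values are not from the source.
not_found = json.dumps({"error_code": "example_code", "message": "example", "trace_id": "t-1"})
try:
    parse_timeline_response(404, not_found)
except ConnectivityApiError as err:
    print(err.status, err.trace_id)
```

Keeping trace_id on the exception makes it easy to attach to an incident report, which fits the incident reporter use case mentioned for this endpoint.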