Enhancement: Add Dashboard Card For Scheduler, Worker, And Job Queue Status

by ADMIN

Background & Rationale

NetRaven's job execution depends on RQ (Redis Queue), the scheduler, and worker containers. Users need real-time visibility into these components to operate the system reliably. This enhancement adds a dashboard card that shows the status of the workers, the scheduler, and the job queue, so users can spot problems quickly and take corrective action.

Implementation Plan

1. Backend

We will implement a new API endpoint that returns the status of the workers, the scheduler, and the job queue as a single JSON response.

Add new API endpoint

We will create a new API endpoint, e.g., /system/job_status, which will return the following information:

  • Worker(s) status: The status and last heartbeat of each worker, plus the total worker count.
  • Scheduler status: Whether the scheduler is active, the number of jobs scheduled, and the next run time.
  • RQ queue stats: Job counts by state: queued, running, failed, and completed.

Here's an example response:

{
  "workers": [
    {"id": "worker-1", "status": "online", "last_heartbeat": "2024-06-01T12:00:00Z"}
  ],
  "scheduler": {
    "status": "active",
    "scheduled_jobs": 5,
    "last_sync": "2024-06-01T12:00:00Z"
  },
  "queue": {
    "queued": 3,
    "running": 1,
    "failed": 0,
    "completed": 10
  }
}
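
One way to keep the backend and the card in agreement on this shape is to describe it with typed structures. The sketch below uses Python dataclasses; the class names are hypothetical, and only the field names and sample values come from the example response above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class WorkerStatus:
    id: str
    status: str          # e.g. "online"
    last_heartbeat: str  # ISO 8601 timestamp in UTC


@dataclass
class SchedulerStatus:
    status: str          # e.g. "active"
    scheduled_jobs: int
    last_sync: str       # ISO 8601 timestamp in UTC


@dataclass
class QueueStats:
    queued: int = 0
    running: int = 0
    failed: int = 0
    completed: int = 0


@dataclass
class JobSystemStatus:
    workers: List[WorkerStatus]
    scheduler: SchedulerStatus
    queue: QueueStats = field(default_factory=QueueStats)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses and lists of them.
        return json.dumps(asdict(self), indent=2)


example = JobSystemStatus(
    workers=[WorkerStatus("worker-1", "online", "2024-06-01T12:00:00Z")],
    scheduler=SchedulerStatus("active", 5, "2024-06-01T12:00:00Z"),
    queue=QueueStats(queued=3, running=1, failed=0, completed=10),
)
print(example.to_json())
```

Serializing from one definition like this makes it harder for the endpoint and the card to drift apart on field names.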

Implementation Notes

To implement this endpoint, we will use RQ's built-in monitoring APIs for queue and worker status. For the scheduler status, we can infer the status from Redis keys or a periodic heartbeat.
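
A minimal sketch of how these notes might translate into code, under stated assumptions. `Queue`, `Worker.all`, the job registries, and `last_heartbeat` are part of RQ; the helper names, the `STALE_AFTER` threshold, and the default queue name are hypothetical. The proposal does not say where the scheduler heartbeat or job count would be stored, so `scheduler_stats` simply takes them as arguments.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Iterable, List, Optional

# Assumption: a heartbeat older than this means the component is not healthy.
STALE_AFTER = timedelta(seconds=60)


def _as_utc(ts: datetime) -> datetime:
    # Some RQ versions return naive UTC datetimes; treat naive values as UTC.
    return ts if ts.tzinfo else ts.replace(tzinfo=timezone.utc)


def is_fresh(ts: Optional[datetime], now: datetime) -> bool:
    return ts is not None and now - _as_utc(ts) <= STALE_AFTER


def queue_stats(queue: Any) -> Dict[str, int]:
    # len() works on an rq.Queue and on each of its job registries.
    # Finished jobs only stay in their registry until their result TTL expires,
    # so "completed" is a recent count, not an all-time total.
    return {
        "queued": len(queue),
        "running": len(queue.started_job_registry),
        "failed": len(queue.failed_job_registry),
        "completed": len(queue.finished_job_registry),
    }


def worker_stats(workers: Iterable[Any], now: datetime) -> List[Dict[str, Any]]:
    result = []
    for worker in workers:
        beat = worker.last_heartbeat
        result.append({
            "id": worker.name,
            "status": "online" if is_fresh(beat, now) else "offline",
            "last_heartbeat": _as_utc(beat).isoformat() if beat else None,
        })
    return result


def scheduler_stats(last_beat: Optional[datetime], scheduled_jobs: int,
                    now: datetime) -> Dict[str, Any]:
    # last_beat would come from a Redis key or heartbeat written by the scheduler.
    status = "active" if is_fresh(last_beat, now) else "inactive"
    return {"status": status, "scheduled_jobs": scheduled_jobs}


def collect_from_rq(redis_conn: Any, queue_name: str = "default") -> Dict[str, Any]:
    # Imported lazily so the helpers above can be exercised without RQ or Redis.
    from rq import Queue, Worker

    now = datetime.now(timezone.utc)
    return {
        "workers": worker_stats(Worker.all(connection=redis_conn), now),
        "queue": queue_stats(Queue(queue_name, connection=redis_conn)),
    }
```

Keeping the RQ calls in one thin function leaves the freshness and counting logic easy to unit test with stand-in objects.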

2. Frontend

To provide a user-friendly interface, we will add a new "Job System Status" card to the dashboard. This card will display the status of workers, the scheduler, and the job queue in a clear and concise manner.

Add a new “Job System Status” card

The card will display the following information:

  • Worker(s) status: The number of workers online and when each was last seen.
  • Scheduler status: Whether the scheduler is active and the number of jobs scheduled.
  • Job queue stats: Job counts by state: queued, running, failed, and completed.

UX Recommendations

To make the card more user-friendly, we can use icons and color codes for quick status recognition. We can also show tooltips or details for errors and warnings. Additionally, we can allow navigation to a detailed job/worker status page for further investigation.
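
The color coding could be derived from the status payload with a few small rules. The Python sketch below is one possible mapping, not a settled design: the rules (any failed job is red, any backlog is yellow, a partly offline worker pool is yellow) and all function names are assumptions. Only the field names come from the example response.

```python
from enum import Enum
from typing import Any, Dict, List


class Level(Enum):
    OK = "🟢"
    WARN = "🟡"
    ERROR = "🔴"


def worker_level(workers: List[Dict[str, Any]]) -> Level:
    online = sum(1 for w in workers if w.get("status") == "online")
    if online == 0:
        return Level.ERROR   # no worker can pick up jobs
    if online < len(workers):
        return Level.WARN    # some workers are offline
    return Level.OK


def scheduler_level(scheduler: Dict[str, Any]) -> Level:
    return Level.OK if scheduler.get("status") == "active" else Level.ERROR


def queue_level(queue: Dict[str, int]) -> Level:
    if queue.get("failed", 0) > 0:
        return Level.ERROR   # assumption: any failed job is worth a red icon
    if queue.get("queued", 0) > 0:
        return Level.WARN    # assumption: a backlog is shown as yellow
    return Level.OK
```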

3. Example UI

Here's an example of how the dashboard card might look:

[🟢] Worker(s): 1 online (last seen: 2s ago)
[🟢] Scheduler: Active (5 jobs scheduled)
[🟡] Queue: 3 queued, 1 running, 0 failed, 10 completed
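
To show how the text of these three lines could be produced from the JSON response, here is a small Python sketch. The function names are hypothetical, the leading status icon is left out, and timestamps are assumed to carry a UTC offset or a trailing "Z" as in the example response.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List


def parse_utc(ts: str) -> datetime:
    # Older Python versions reject a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def ago(ts: str, now: datetime) -> str:
    seconds = max(0, int((now - parse_utc(ts)).total_seconds()))
    if seconds < 60:
        return f"{seconds}s ago"
    if seconds < 3600:
        return f"{seconds // 60}m ago"
    return f"{seconds // 3600}h ago"


def render_card(status: Dict[str, Any], now: datetime) -> List[str]:
    online = [w for w in status["workers"] if w["status"] == "online"]
    if online:
        # Report the most recent heartbeat among the online workers.
        latest = max(online, key=lambda w: parse_utc(w["last_heartbeat"]))
        seen = ago(latest["last_heartbeat"], now)
        workers = f"Worker(s): {len(online)} online (last seen: {seen})"
    else:
        workers = "Worker(s): none online"

    sched = status["scheduler"]
    scheduler = (f"Scheduler: {sched['status'].capitalize()} "
                 f"({sched['scheduled_jobs']} jobs scheduled)")

    q = status["queue"]
    queue = (f"Queue: {q['queued']} queued, {q['running']} running, "
             f"{q['failed']} failed, {q['completed']} completed")
    return [workers, scheduler, queue]
```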

4. Acceptance Criteria

To ensure that the implementation meets the requirements, we will define the following acceptance criteria:

  • The dashboard displays a new card with real-time status for workers, scheduler, and job queue.
  • The status is updated automatically without page reload.
  • If a component is unhealthy or jobs are failing, a clear warning is shown.
  • The implementation is documented and tested.
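
The warning criterion lends itself to a small, testable function. The sketch below is an illustration only: the function name and the message wording are assumptions, and it treats any worker that is not "online", a scheduler that is not "active", or a non-zero failed count as worth a warning.

```python
from typing import Any, Dict, List


def collect_warnings(status: Dict[str, Any]) -> List[str]:
    """Return human-readable warnings for a status payload (empty if healthy)."""
    warnings: List[str] = []

    workers = status.get("workers", [])
    offline = [w.get("id", "?") for w in workers if w.get("status") != "online"]
    if not workers:
        warnings.append("No workers are registered.")
    elif offline:
        warnings.append(f"Workers not online: {', '.join(offline)}.")

    if status.get("scheduler", {}).get("status") != "active":
        warnings.append("Scheduler is not active.")

    failed = status.get("queue", {}).get("failed", 0)
    if failed > 0:
        warnings.append(f"{failed} job(s) have failed.")

    return warnings
```

A unit test can then feed it a healthy payload and a degraded one and assert on the returned list, which covers the warning and the testing criteria together.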

5. References

For more information on RQ monitoring and best practices for job queue dashboards, please refer to the following resources:

Frequently Asked Questions

Q: What is the purpose of this enhancement?

A: The purpose of this enhancement is to provide a comprehensive dashboard card that displays the status of workers, the scheduler, and the job queue, enabling users to quickly identify potential issues and take corrective action.

Q: What information will be displayed on the dashboard card?

A: The dashboard card will display the following information:

  • Worker(s) status: The number of workers online and when each was last seen.
  • Scheduler status: Whether the scheduler is active and the number of jobs scheduled.
  • Job queue stats: Job counts by state: queued, running, failed, and completed.

Q: How will the status be updated in real-time?

A: The card will refresh automatically, without a page reload. The status data itself comes from RQ's built-in monitoring APIs for queue and worker status. For the scheduler, the status can be inferred from Redis keys or a periodic heartbeat.

Q: What UX recommendations will be implemented?

A: To make the card more user-friendly, we will use icons and color codes for quick status recognition. We can also show tooltips or details for errors and warnings. Additionally, we can allow navigation to a detailed job/worker status page for further investigation.

Q: What are the acceptance criteria for this enhancement?

A: The acceptance criteria for this enhancement are:

  • The dashboard displays a new card with real-time status for workers, scheduler, and job queue.
  • The status is updated automatically without page reload.
  • If a component is unhealthy or jobs are failing, a clear warning is shown.
  • The implementation is documented and tested.

Q: What resources will be used for this enhancement?

A: For more information on RQ monitoring and best practices for job queue dashboards, please refer to the following resources:

Q: What are the benefits of this enhancement?

A: The benefits of this enhancement are:

  • Improved visibility into the status of workers, scheduler, and job queue.
  • Faster identification of potential issues and corrective action.
  • Enhanced user experience through real-time updates and clear warnings.

Q: What is the timeline for this enhancement?

A: The timeline for this enhancement will be determined based on the complexity of the implementation and the availability of resources. However, we anticipate that the enhancement will be completed within a few weeks.

Q: Who will be responsible for implementing this enhancement?

A: The implementation of this enhancement will be led by the development team, with input and guidance from the product management and UX teams.

Q: How will the enhancement be tested and validated?

A: The enhancement will be tested and validated through a combination of unit testing, integration testing, and user acceptance testing (UAT). The testing and validation process will be led by the quality assurance team.