8 Commits

Author SHA1 Message Date
mike 9f805c7e07 net: monitor and report connectivity changes
build-windows / build-hello-agent-x64 (push) Successful in 6m40s
build-windows / sign-hello-agent-x64 (push) Successful in 6s
build-windows / validate-hello-agent-x64 (push) Successful in 9s
Add a network-change monitor that polls public IP, per-interface LAN IPv4
(every adapter), and Wi-Fi SSID/BSSID once a minute, diffs against a
snapshot persisted in LocalConfig, and POSTs only the changes (old -> new)
to /api/agent/network-events. Restart-safe (no re-emit) with the same
queue/retry/backoff as the perf + login-event monitors. Powers the opsbase
device Network-history tab and the fleet network log.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 13:15:40 +00:00
mike 69ec496a62 perf: report top-5 processes by CPU and memory
build-windows / build-hello-agent-x64 (push) Successful in 6m35s
build-windows / sign-hello-agent-x64 (push) Successful in 5s
build-windows / validate-hello-agent-x64 (push) Successful in 9s
Collect every named process per sample, then send the top-5 by CPU and the
top-5 by memory as a `top_processes` object alongside the existing single
top-CPU/top-memory scalars (kept for backward compatibility). opsbase shows
these as 5-row lists in the device's Live performance card.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 12:21:48 +00:00
mike b5e8c00d92 inventory: report device class (form factor)
build-windows / build-hello-agent-x64 (push) Successful in 6m43s
build-windows / sign-hello-agent-x64 (push) Successful in 6s
build-windows / validate-hello-agent-x64 (push) Successful in 10s
Derive a coarse device_class from SMBIOS chassis type + PCSystemType, with
VM detection (manufacturer/model) and Windows Server OS taking precedence:
Laptop / Desktop / Server / Tablet / Virtual Machine / Unknown. Surfaced in
opsbase on the device list and Hardware tab.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 11:41:54 +00:00
mike 95e51842ae release: bump to 0.1.8 (fix invalid edition)
build-windows / build-hello-agent-x64 (push) Successful in 6m19s
build-windows / sign-hello-agent-x64 (push) Successful in 5s
build-windows / validate-hello-agent-x64 (push) Successful in 9s
Bump version for the authoritative-strategy + session-notification changes.
edition was set to an invalid "2026"; restore to 2021 (rust-version 1.75
predates edition 2024).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 10:59:09 +02:00
mike faf3b1303b strategy: make pushed config authoritative
handle_config_options only ever inserted/overwrote the keys a strategy
sent, so removing a key (or unassigning a strategy) on the server left the
old value lingering on the device.

Persist the set of keys applied on the previous push (strategy_managed_keys
in LocalConfig) and, on each apply, reset any key the server managed before
but no longer sends back to its default. Keys the server never managed (the
user's own local settings) are left untouched, so this remains a managed
overlay rather than a wipe.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 08:54:12 +00:00
mike 31d5a881b6 cm: only notify the user once all sessions have closed
remove_connection fired the "session ended" banner for every connection
that closed. With multiple concurrent sessions (a second supporter, or a
file-transfer session running alongside remote control) the user saw one
banner per close even though support was still ongoing.

Track the remaining approved-session count under the same lock and show the
banner only when the last session has ended. Denied/never-approved
connections still produce no banner.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 08:54:12 +00:00
mike 14411987e7 sysinfo: advertise Ed25519 public key for opsbase TOFU enrollment
build-windows / build-hello-agent-x64 (push) Successful in 5m19s
build-windows / sign-hello-agent-x64 (push) Successful in 8s
build-windows / validate-hello-agent-x64 (push) Successful in 11s
Include the agent's base64 Ed25519 public key in the `pk` field of the
sysinfo upload. opsbase (acting as the agent's api-server) has no rendezvous
server to learn the key from, so it pins this key trust-on-first-use on first
contact and verifies every later signed request against it.

This is the same keypair sign.rs already signs requests with. Vanilla
rustdesk servers ignore the unknown field, so the change is backward
compatible.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 19:23:58 +00:00
mike 8de2ebea85 Implement performance monitor
build-windows / build-hello-agent-x64 (push) Successful in 5m0s
build-windows / sign-hello-agent-x64 (push) Successful in 5s
build-windows / validate-hello-agent-x64 (push) Successful in 7s
2026-05-22 21:52:13 +02:00
10 changed files with 1325 additions and 4 deletions
Generated
+2 -1
View File
@@ -3197,13 +3197,14 @@ checksum = "2304e00983f87ffb38b55b444b5e3b60a884b5d30c0fca7d82fe33449bbe55ea"
[[package]] [[package]]
name = "hello-agent" name = "hello-agent"
version = "0.1.6" version = "0.1.8"
dependencies = [ dependencies = [
"anyhow", "anyhow",
"env_logger 0.10.2", "env_logger 0.10.2",
"hbb_common", "hbb_common",
"log", "log",
"rustdesk", "rustdesk",
"serde 1.0.228",
"serde_json 1.0.118", "serde_json 1.0.118",
"tokio", "tokio",
"winapi", "winapi",
+7 -1
View File
@@ -1,6 +1,6 @@
[package] [package]
name = "hello-agent" name = "hello-agent"
version = "0.1.6" version = "0.1.8"
edition = "2021" edition = "2021"
rust-version = "1.75" rust-version = "1.75"
description = "Headless RustDesk-protocol-compatible support agent for Windows" description = "Headless RustDesk-protocol-compatible support agent for Windows"
@@ -34,6 +34,12 @@ anyhow = "1"
# the inventory module's `use serde_json` doesn't depend on internal # the inventory module's `use serde_json` doesn't depend on internal
# implementation details of hbb_common. # implementation details of hbb_common.
serde_json = "1" serde_json = "1"
# `perf_events.rs` derives Deserialize on the PowerShell row schema.
# hbb_common re-exports `serde_derive` and `serde_json` but NOT `serde`
# itself — and `#[derive(Deserialize)]` expands to a path that references
# the `serde` crate root, so we depend on it explicitly with the `derive`
# feature.
serde = { version = "1", features = ["derive"] }
[target.'cfg(target_os = "windows")'.dependencies] [target.'cfg(target_os = "windows")'.dependencies]
windows-service = "0.6" windows-service = "0.6"
+18 -1
View File
@@ -140,9 +140,26 @@ impl InvokeUiCM for HeadlessCm {
fn remove_connection(&self, id: i32, _close: bool) { fn remove_connection(&self, id: i32, _close: bool) {
trace(&format!("remove_connection: id={id}")); trace(&format!("remove_connection: id={id}"));
let entry = self.approved.lock().unwrap().remove(&id); // Remove this session and read how many approved sessions remain, while
// holding the lock so the check is atomic against concurrent add/remove.
let (entry, remaining) = {
let mut approved = self.approved.lock().unwrap();
let entry = approved.remove(&id);
(entry, approved.len())
};
if let Some((peer_id, name)) = entry { if let Some((peer_id, name)) = entry {
// Only notify once EVERY approved session has ended. With multiple
// concurrent sessions (a second supporter, or a file-transfer
// session running alongside remote control) the user shouldn't get
// a "session ended" banner while other sessions are still open —
// they'd see one banner per close even though support is ongoing.
if remaining == 0 {
std::thread::spawn(move || show_session_ended(&peer_id, &name)); std::thread::spawn(move || show_session_ended(&peer_id, &name));
} else {
trace(&format!(
"remove_connection: {remaining} approved session(s) still open; suppressing banner"
));
}
} }
} }
+32
View File
@@ -154,10 +154,42 @@ $installed_software = @($installed_software | Sort-Object name, version)
$os_release = "$($os.Version)" $os_release = "$($os.Version)"
if ($displayVersion) { $os_release = "$($os.Version) $displayVersion" } if ($displayVersion) { $os_release = "$($os.Version) $displayVersion" }
# Device class — a coarse form factor for asset grouping. VM first (most
# specific), then Windows Server OS, then the SMBIOS chassis type, then the
# computer-system PCSystemType as a fallback.
$enc = Get-CimInstance -ClassName Win32_SystemEnclosure -ErrorAction SilentlyContinue | Select-Object -First 1
$chassis = @(); if ($enc) { $chassis = @($enc.ChassisTypes) }
$man = "$($cs.Manufacturer)".ToLower()
$mod = "$($cs.Model)".ToLower()
$device_class = 'Unknown'
if ($man -match 'vmware|virtualbox|innotek|qemu|xen|parallels|kvm' -or $mod -match 'virtual machine|vmware|virtualbox|kvm|qemu') {
$device_class = 'Virtual Machine'
} elseif ("$($os.Caption)" -match 'Server') {
$device_class = 'Server'
} else {
foreach ($ct in $chassis) {
if ($ct -in 8,9,10,14) { $device_class = 'Laptop'; break }
elseif ($ct -in 30,31,32) { $device_class = 'Tablet'; break }
elseif ($ct -in 17,23,28,29) { $device_class = 'Server'; break }
elseif ($ct -in 3,4,5,6,7,13,15,16,24) { $device_class = 'Desktop'; break }
}
if ($device_class -eq 'Unknown') {
switch ([int]$cs.PCSystemType) {
1 { $device_class = 'Desktop' }
2 { $device_class = 'Laptop' }
3 { $device_class = 'Desktop' }
4 { $device_class = 'Server' }
5 { $device_class = 'Server' }
}
}
}
$result = [pscustomobject]@{ $result = [pscustomobject]@{
serial_number = $bios.SerialNumber serial_number = $bios.SerialNumber
manufacturer = $cs.Manufacturer manufacturer = $cs.Manufacturer
model = $cs.Model model = $cs.Model
device_class = $device_class
domain = $cs.Domain domain = $cs.Domain
os_distro = $os.Caption os_distro = $os.Caption
os_release = $os_release os_release = $os_release
+4
View File
@@ -31,6 +31,10 @@ mod cm_popup;
mod exec; mod exec;
mod login_events; mod login_events;
#[cfg(target_os = "windows")] #[cfg(target_os = "windows")]
mod net_monitor;
mod perf;
mod perf_events;
#[cfg(target_os = "windows")]
mod service; mod service;
#[cfg(target_os = "windows")] #[cfg(target_os = "windows")]
mod unattended_password; mod unattended_password;
+358
View File
@@ -0,0 +1,358 @@
// Network-change monitor. Polls the device's connectivity state — public IP,
// per-interface LAN IPv4, and the current Wi-Fi SSID/BSSID — diffs it against
// the last-known snapshot, and POSTs only the *changes* to
// `<api-server>/api/agent/network-events`.
//
// The snapshot is persisted in LocalConfig so an agent restart doesn't re-emit
// the whole state as "changes"; only a genuine change since the last poll
// produces events. The very first run (no persisted snapshot) records the
// initial state with an empty `old` value so the history has a starting point.
//
// Mirrors `perf.rs`: a dedicated thread with a current-thread tokio runtime, an
// in-memory queue, and signed POSTs with capped backoff + bounded retry.
#![cfg(target_os = "windows")]
use anyhow::{anyhow, Result};
use hbb_common::config::{Config, LocalConfig};
use hbb_common::serde_json::{self, json, Value};
use std::collections::BTreeMap;
use std::sync::Mutex;
use std::time::Duration;
/// Poll cadence for the cheap local state (LAN + Wi-Fi).
const SAMPLE_INTERVAL: Duration = Duration::from_secs(60);
/// Refresh the (external) public IP only every Nth poll to limit egress.
const PUBLIC_IP_EVERY: u32 = 5; // ~5 minutes
const FLUSH_BASE: Duration = Duration::from_secs(15);
const FLUSH_MAX: Duration = Duration::from_secs(15 * 60);
const MAX_EVENTS_PER_POST: usize = 256;
const DROP_AFTER: u32 = 300;
const STATE_KEY: &str = "net_monitor_state";
const CREATE_NO_WINDOW: u32 = 0x0800_0000;
#[derive(Clone, Default, Debug)]
struct PendingEvent {
at: i64,
kind: String,
iface: String,
old: String,
new: String,
attempts: u32,
}
static QUEUE: Mutex<Vec<PendingEvent>> = Mutex::new(Vec::new());
/// Current connectivity snapshot. Persisted to LocalConfig as JSON.
#[derive(Clone, Default)]
struct Snapshot {
public_ip: String,
lan: BTreeMap<String, Vec<String>>,
ssid: String,
bssid: String,
}
impl Snapshot {
fn to_value(&self) -> Value {
let lan: serde_json::Map<String, Value> = self
.lan
.iter()
.map(|(k, v)| (k.clone(), json!(v)))
.collect();
json!({
"public_ip": self.public_ip,
"lan": Value::Object(lan),
"ssid": self.ssid,
"bssid": self.bssid,
})
}
fn from_value(v: &Value) -> Snapshot {
let mut lan = BTreeMap::new();
if let Some(obj) = v.get("lan").and_then(|x| x.as_object()) {
for (k, val) in obj {
let ips: Vec<String> = val
.as_array()
.map(|a| a.iter().filter_map(|x| x.as_str().map(String::from)).collect())
.unwrap_or_default();
lan.insert(k.clone(), ips);
}
}
Snapshot {
public_ip: v.get("public_ip").and_then(|x| x.as_str()).unwrap_or("").to_string(),
lan,
ssid: v.get("ssid").and_then(|x| x.as_str()).unwrap_or("").to_string(),
bssid: v.get("bssid").and_then(|x| x.as_str()).unwrap_or("").to_string(),
}
}
}
pub fn start() {
use std::sync::atomic::{AtomicBool, Ordering};
static STARTED: AtomicBool = AtomicBool::new(false);
if STARTED.swap(true, Ordering::SeqCst) {
return;
}
std::thread::spawn(move || {
let rt = match tokio::runtime::Builder::new_current_thread().enable_all().build() {
Ok(rt) => rt,
Err(e) => {
hbb_common::log::warn!("net_monitor: build runtime: {e}");
return;
}
};
rt.block_on(run_loop());
});
}
async fn run_loop() {
// Load the last-known snapshot so a restart doesn't re-emit everything.
let mut prev: Snapshot = serde_json::from_str::<Value>(&LocalConfig::get_option(STATE_KEY))
.ok()
.map(|v| Snapshot::from_value(&v))
.unwrap_or_default();
// On the very first run there's no persisted state; emit the initial state.
let mut first_run = LocalConfig::get_option(STATE_KEY).is_empty();
let mut tick: u32 = 0;
let mut flush_backoff = FLUSH_BASE;
loop {
let refresh_public = first_run || tick % PUBLIC_IP_EVERY == 0;
let cur = collect(&prev, refresh_public);
let at = hbb_common::chrono::Utc::now().timestamp();
let events = diff(&prev, &cur, at);
if !events.is_empty() {
// Skip the empty→empty noise on a brand-new machine with no data.
let mut q = QUEUE.lock().unwrap();
q.extend(events);
}
// Advance + persist the baseline regardless of whether the flush
// succeeds, so a network outage doesn't make us re-emit on recovery.
prev = cur;
LocalConfig::set_option(STATE_KEY.to_string(), prev.to_value().to_string());
first_run = false;
match flush_once().await {
FlushOutcome::Idle | FlushOutcome::AllSent => flush_backoff = FLUSH_BASE,
FlushOutcome::Failed => flush_backoff = (flush_backoff * 2).min(FLUSH_MAX),
}
tick = tick.wrapping_add(1);
tokio::time::sleep(SAMPLE_INTERVAL.min(flush_backoff)).await;
}
}
/// Build the current snapshot. `prev` supplies the carried-over public IP when
/// we're not refreshing it this tick.
fn collect(prev: &Snapshot, refresh_public: bool) -> Snapshot {
let lan = collect_lan();
let (ssid, bssid) = collect_wifi();
let public_ip = if refresh_public {
collect_public_ip().unwrap_or_else(|| prev.public_ip.clone())
} else {
prev.public_ip.clone()
};
Snapshot { public_ip, lan, ssid, bssid }
}
/// Per-interface IPv4 addresses (every adapter, loopback excluded), sorted.
fn collect_lan() -> BTreeMap<String, Vec<String>> {
const PS: &str = r#"
$rows = Get-NetIPAddress -AddressFamily IPv4 -ErrorAction SilentlyContinue |
Where-Object { $_.IPAddress -notlike '127.*' } |
Group-Object InterfaceAlias |
ForEach-Object { [pscustomobject]@{ iface = $_.Name; ips = @($_.Group.IPAddress) } }
ConvertTo-Json -Compress -Depth 4 -InputObject @($rows)
"#;
let mut map = BTreeMap::new();
let Some(out) = run_ps(PS) else { return map };
let Ok(v) = serde_json::from_str::<Value>(out.trim()) else { return map };
if let Some(arr) = v.as_array() {
for row in arr {
let iface = row.get("iface").and_then(|x| x.as_str()).unwrap_or("").to_string();
if iface.is_empty() {
continue;
}
let mut ips: Vec<String> = row
.get("ips")
.and_then(|x| x.as_array())
.map(|a| a.iter().filter_map(|x| x.as_str().map(String::from)).collect())
.unwrap_or_default();
ips.sort();
ips.dedup();
map.insert(iface, ips);
}
}
map
}
fn collect_public_ip() -> Option<String> {
const PS: &str =
"try { (Invoke-RestMethod -Uri 'https://api.ipify.org' -TimeoutSec 5).Trim() } catch { '' }";
let out = run_ps(PS)?;
let ip = out.trim().to_string();
if ip.is_empty() {
None
} else {
Some(ip)
}
}
fn collect_wifi() -> (String, String) {
let (current, _nearby) = crate::wifi_native::collect();
let get = |k: &str| {
current
.as_ref()
.and_then(|c| c.get(k))
.and_then(|v| v.as_str())
.unwrap_or("")
.to_string()
};
(get("ssid"), get("bssid"))
}
fn run_ps(script: &str) -> Option<String> {
use std::os::windows::process::CommandExt;
use std::process::Command;
let out = Command::new("powershell.exe")
.args([
"-NoProfile",
"-NonInteractive",
"-ExecutionPolicy",
"Bypass",
"-Command",
script,
])
.creation_flags(CREATE_NO_WINDOW)
.output()
.ok()?;
if !out.status.success() {
return None;
}
Some(String::from_utf8_lossy(&out.stdout).into_owned())
}
fn ev(at: i64, kind: &str, iface: &str, old: &str, new: &str) -> PendingEvent {
PendingEvent {
at,
kind: kind.to_string(),
iface: iface.to_string(),
old: old.to_string(),
new: new.to_string(),
attempts: 0,
}
}
fn diff(prev: &Snapshot, cur: &Snapshot, at: i64) -> Vec<PendingEvent> {
let mut events = Vec::new();
if prev.public_ip != cur.public_ip && !(prev.public_ip.is_empty() && cur.public_ip.is_empty()) {
events.push(ev(at, "public_ip", "", &prev.public_ip, &cur.public_ip));
}
if prev.ssid != cur.ssid && !(prev.ssid.is_empty() && cur.ssid.is_empty()) {
events.push(ev(at, "wifi_ssid", "", &prev.ssid, &cur.ssid));
}
if prev.bssid != cur.bssid && !(prev.bssid.is_empty() && cur.bssid.is_empty()) {
events.push(ev(at, "wifi_bssid", "", &prev.bssid, &cur.bssid));
}
// LAN IPv4 per interface, over the union of interface names.
let mut ifaces: Vec<&String> = prev.lan.keys().chain(cur.lan.keys()).collect();
ifaces.sort();
ifaces.dedup();
for iface in ifaces {
let old = prev.lan.get(iface).map(|v| v.join(", ")).unwrap_or_default();
let new = cur.lan.get(iface).map(|v| v.join(", ")).unwrap_or_default();
if old != new {
events.push(ev(at, "lan_ipv4", iface, &old, &new));
}
}
events
}
enum FlushOutcome {
Idle,
AllSent,
Failed,
}
async fn flush_once() -> FlushOutcome {
let batch: Vec<PendingEvent> = {
let mut q = QUEUE.lock().unwrap();
if q.is_empty() {
return FlushOutcome::Idle;
}
let take = q.len().min(MAX_EVENTS_PER_POST);
q.drain(..take).collect()
};
match post_batch(&batch).await {
Ok(()) => FlushOutcome::AllSent,
Err(e) => {
hbb_common::log::warn!("net_monitor: flush of {} event(s) failed: {e:#}", batch.len());
let mut requeued: Vec<PendingEvent> = batch
.into_iter()
.filter_map(|mut s| {
s.attempts = s.attempts.saturating_add(1);
if s.attempts >= DROP_AFTER {
None
} else {
Some(s)
}
})
.collect();
let mut q = QUEUE.lock().unwrap();
let tail: Vec<PendingEvent> = q.drain(..).collect();
requeued.extend(tail);
*q = requeued;
FlushOutcome::Failed
}
}
}
async fn post_batch(batch: &[PendingEvent]) -> Result<()> {
let api = librustdesk::common::get_api_server(
Config::get_option("api-server"),
Config::get_option("custom-rendezvous-server"),
);
if api.is_empty() {
return Err(anyhow!("no api-server configured yet"));
}
let url = format!("{api}/api/agent/network-events");
let id = Config::get_id();
let uuid = librustdesk::common::encode64(hbb_common::get_uuid());
let events: Vec<Value> = batch
.iter()
.map(|e| {
json!({
"at": e.at,
"kind": e.kind,
"iface": e.iface,
"old": e.old,
"new": e.new,
})
})
.collect();
let body = json!({ "id": id, "uuid": uuid, "events": events }).to_string();
let headers = librustdesk::hbbs_http::sign::build_signed_headers(
"POST",
"/api/agent/network-events",
body.as_bytes(),
)
.unwrap_or_default();
let resp = librustdesk::common::post_request(url, body, &headers)
.await
.map_err(|e| anyhow!("post: {e}"))?;
let trimmed = resp.trim();
if trimmed == "OK" || trimmed == "ID_NOT_FOUND" {
Ok(())
} else {
Err(anyhow!("unexpected response: {trimmed}"))
}
}
+339
View File
@@ -0,0 +1,339 @@
// Continuous performance sampling.
//
// One sample per minute: overall CPU%, memory used / total, top
// process by CPU, top process by memory, uptime, process count.
// Posted in batches to /api/agent/metrics; surfaced on the admin
// device detail page as a 24 h sparkline plus a "live" snapshot card.
//
// Architecture mirrors `login_events`:
// * One background thread, its own current-thread Tokio runtime.
// * Sysinfo crate (vendored under hbb_common) does the cross-cutting
// work — we keep one `System` instance alive across iterations so
// its per-process CPU% accounting (which is differential against
// the previous refresh) stays accurate.
// * In-memory queue with retry-with-backoff and a hard drop cap so
// a permanently-misconfigured agent can't balloon memory.
// * Server-side `INSERT OR IGNORE` keyed on (peer_id, at) dedups any
// retries that get there twice.
//
// Process CPU% on Windows is reported by sysinfo as "% of one core",
// so a busy multi-threaded process can read >100%. We normalise by
// `cpus().len()` so the snapshot card's "top CPU: chrome.exe 18%"
// is comparable to the overall CPU%. Memory is in bytes from sysinfo;
// we convert to MB on the wire.
use anyhow::{anyhow, Result};
use std::sync::Mutex;
use std::time::Duration;
/// Sampling cadence. 60 s strikes a balance between resolution (enough
/// granularity to spot a 2-minute CPU spike) and storage (~1440 rows /
/// device / day on the server side).
const SAMPLE_INTERVAL: Duration = Duration::from_secs(60);
/// Flush cadence on the happy path. The reporter tries to flush after
/// every sample anyway; this is the floor used by the network-error
/// backoff before it doubles.
const FLUSH_INTERVAL_BASE: Duration = Duration::from_secs(60);
const FLUSH_INTERVAL_MAX: Duration = Duration::from_secs(15 * 60);
/// At backoff cap = 15 min, this is ~3 days of trying — enough to ride
/// out a long server-side outage. Beyond that we drop, on the same
/// reasoning as `login_events::DROP_AFTER`.
const DROP_AFTER: u32 = 300;
/// Must match the server's MAX_SAMPLES_PER_POST.
const MAX_SAMPLES_PER_POST: usize = 512;
#[derive(Clone, Debug, Default)]
struct PendingSample {
at: i64,
cpu_pct: f64,
mem_used_mb: i64,
mem_total_mb: i64,
proc_count: i64,
uptime_secs: i64,
top_cpu_name: String,
top_cpu_pct: f64,
top_mem_name: String,
top_mem_mb: i64,
/// Top-5 processes by CPU (name, normalised %) and by memory (name, MB).
top_cpu_procs: Vec<(String, f64)>,
top_mem_procs: Vec<(String, i64)>,
attempts: u32,
}
#[cfg(target_os = "windows")]
static QUEUE: Mutex<Vec<PendingSample>> = Mutex::new(Vec::new());
/// Kick off the metrics sampler. Safe to call multiple times — guarded
/// by an `AtomicBool` so a stray second call is a no-op.
pub fn start() {
#[cfg(not(target_os = "windows"))]
{
// Cross-platform stub; the implementation only runs on Windows
// because that's the only OS hello-agent ships on. Sysinfo is
// cross-platform, so a future Linux build can drop the cfg-gate
// and reuse this module unchanged.
}
#[cfg(target_os = "windows")]
{
use std::sync::atomic::{AtomicBool, Ordering};
static STARTED: AtomicBool = AtomicBool::new(false);
if STARTED.swap(true, Ordering::SeqCst) {
return;
}
std::thread::spawn(move || {
let rt = match tokio::runtime::Builder::new_current_thread()
.enable_all()
.build()
{
Ok(rt) => rt,
Err(e) => {
log::warn!("perf: build runtime: {e}");
return;
}
};
rt.block_on(run_loop());
});
}
}
#[cfg(target_os = "windows")]
async fn run_loop() {
use hbb_common::sysinfo::System;
// One System instance, refreshed across iterations. Sysinfo's
// per-process CPU% is computed against the previous refresh, so a
// throwaway `System::new()` each tick would always read 0%.
let mut sys = System::new();
// Prime CPU + processes once so the first real sample isn't NaN.
// The MINIMUM_CPU_UPDATE_INTERVAL sleep is what sysinfo's docs
// recommend between back-to-back CPU refreshes; here we have the
// full SAMPLE_INTERVAL coming up so we just refresh-and-wait.
sys.refresh_cpu();
sys.refresh_processes();
tokio::time::sleep(SAMPLE_INTERVAL).await;
let mut flush_backoff = FLUSH_INTERVAL_BASE;
loop {
if let Some(sample) = collect_sample(&mut sys) {
QUEUE.lock().unwrap().push(sample);
}
match flush_once().await {
FlushOutcome::Idle | FlushOutcome::AllSent => {
flush_backoff = FLUSH_INTERVAL_BASE;
}
FlushOutcome::Failed => {
flush_backoff = (flush_backoff * 2).min(FLUSH_INTERVAL_MAX);
}
}
// We always sleep SAMPLE_INTERVAL — the network backoff slows
// the *retry* of pending events, not observation. Sampling
// continues at full cadence so a transient outage doesn't punch
// a hole in the chart on either side of the dropped window.
tokio::time::sleep(SAMPLE_INTERVAL.min(flush_backoff)).await;
}
}
#[cfg(target_os = "windows")]
fn collect_sample(sys: &mut hbb_common::sysinfo::System) -> Option<PendingSample> {
sys.refresh_cpu();
sys.refresh_memory();
sys.refresh_processes();
let at = hbb_common::chrono::Utc::now().timestamp();
let cpu_pct = sys.global_cpu_info().cpu_usage() as f64;
// Sysinfo returns bytes on every platform; the server schema stores
// MB to keep row sizes small at scale. Divide-then-cast bounds
// arithmetic to u64 territory.
let mem_total_mb = (sys.total_memory() / 1024 / 1024) as i64;
let mem_used_mb = (sys.used_memory() / 1024 / 1024) as i64;
let uptime_secs = sys.uptime() as i64;
// Per-process CPU% from sysinfo is normalised per core: a 4-thread
// process pinning 4 cores on a 4-core machine reads 400%. Divide
// by core count so the snapshot card's number matches the overall
// CPU% reading (both 0-100 of the whole machine).
let cpu_count = sys.cpus().len().max(1) as f64;
// Collect every (named) process once, then derive the top-5 by CPU and by
// memory. The single top_cpu_*/top_mem_* scalars are kept for backward
// compatibility (older servers ignore top_processes).
let mut procs: Vec<(&str, f64, i64)> = Vec::new();
let mut proc_count = 0i64;
for proc in sys.processes().values() {
proc_count += 1;
let name = proc.name();
// Some kernel-side rows show up with empty names on Windows;
// skip them so we don't ever render a top row with no label.
if name.is_empty() {
continue;
}
let cu = proc.cpu_usage();
let cu = if cu.is_finite() { (cu as f64 / cpu_count).min(100.0) } else { 0.0 };
let mb = (proc.memory() / 1024 / 1024) as i64;
procs.push((name, cu, mb));
}
let mut by_cpu = procs.clone();
by_cpu.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
let top_cpu_procs: Vec<(String, f64)> = by_cpu
.iter()
.take(5)
.map(|(n, c, _)| (n.to_string(), *c))
.collect();
let mut by_mem = procs;
by_mem.sort_by(|a, b| b.2.cmp(&a.2));
let top_mem_procs: Vec<(String, i64)> = by_mem
.iter()
.take(5)
.map(|(n, _, m)| (n.to_string(), *m))
.collect();
let (top_cpu_name, top_cpu_pct) = top_cpu_procs
.first()
.cloned()
.unwrap_or_default();
let (top_mem_name, top_mem_mb) = top_mem_procs
.first()
.cloned()
.unwrap_or_default();
Some(PendingSample {
at,
cpu_pct,
mem_used_mb,
mem_total_mb,
proc_count,
uptime_secs,
top_cpu_name,
top_cpu_pct,
top_mem_name,
top_mem_mb,
top_cpu_procs,
top_mem_procs,
attempts: 0,
})
}
#[cfg(target_os = "windows")]
enum FlushOutcome {
Idle,
AllSent,
Failed,
}
#[cfg(target_os = "windows")]
async fn flush_once() -> FlushOutcome {
let batch: Vec<PendingSample> = {
let mut q = QUEUE.lock().unwrap();
if q.is_empty() {
return FlushOutcome::Idle;
}
let take = q.len().min(MAX_SAMPLES_PER_POST);
q.drain(..take).collect()
};
match post_batch(&batch).await {
Ok(()) => FlushOutcome::AllSent,
Err(e) => {
log::warn!(
"perf: flush of {} sample(s) failed: {e:#}",
batch.len(),
);
let mut requeued: Vec<PendingSample> = batch
.into_iter()
.filter_map(|mut s| {
s.attempts = s.attempts.saturating_add(1);
if s.attempts >= DROP_AFTER {
log::warn!(
"perf: dropping sample after {} attempts: at={}",
s.attempts, s.at,
);
None
} else {
Some(s)
}
})
.collect();
let mut q = QUEUE.lock().unwrap();
let tail: Vec<PendingSample> = q.drain(..).collect();
requeued.extend(tail);
*q = requeued;
FlushOutcome::Failed
}
}
}
#[cfg(target_os = "windows")]
async fn post_batch(batch: &[PendingSample]) -> Result<()> {
let api = librustdesk::common::get_api_server(
hbb_common::config::Config::get_option("api-server"),
hbb_common::config::Config::get_option("custom-rendezvous-server"),
);
if api.is_empty() {
return Err(anyhow!("no api-server configured yet"));
}
let url = format!("{api}/api/agent/metrics");
let id = hbb_common::config::Config::get_id();
let uuid = librustdesk::common::encode64(hbb_common::get_uuid());
let samples: Vec<hbb_common::serde_json::Value> = batch
.iter()
.map(|s| {
hbb_common::serde_json::json!({
"at": s.at,
"cpu_pct": s.cpu_pct,
"mem_used_mb": s.mem_used_mb,
"mem_total_mb": s.mem_total_mb,
"proc_count": s.proc_count,
"uptime_secs": s.uptime_secs,
"top_cpu_name": s.top_cpu_name,
"top_cpu_pct": s.top_cpu_pct,
"top_mem_name": s.top_mem_name,
"top_mem_mb": s.top_mem_mb,
"top_processes": {
"cpu": s.top_cpu_procs.iter()
.map(|(n, p)| hbb_common::serde_json::json!({"name": n, "pct": p}))
.collect::<Vec<_>>(),
"mem": s.top_mem_procs.iter()
.map(|(n, m)| hbb_common::serde_json::json!({"name": n, "mb": m}))
.collect::<Vec<_>>(),
},
})
})
.collect();
let body = hbb_common::serde_json::json!({
"id": id,
"uuid": uuid,
"samples": samples,
})
.to_string();
let headers = librustdesk::hbbs_http::sign::build_signed_headers(
"POST",
"/api/agent/metrics",
body.as_bytes(),
)
.unwrap_or_default();
let resp = librustdesk::common::post_request(url, body, &headers)
.await
.map_err(|e| anyhow!("post: {e}"))?;
let trimmed = resp.trim();
if trimmed == "OK" || trimmed == "ID_NOT_FOUND" {
// ID_NOT_FOUND mirrors the unattended_password / login_events
// contract: server doesn't know the peer yet (rendezvous race
// on first boot). We drop the batch rather than retry forever;
// the next sample lands once /api/heartbeat has created the
// peer row.
Ok(())
} else {
Err(anyhow!("unexpected response: {trimmed}"))
}
}
+507
View File
@@ -0,0 +1,507 @@
// Windows event-log scraper for performance-related events.
//
// Pulls fresh entries from three channels Microsoft itself uses to
// flag "the OS noticed this machine was slow":
//
// * `Microsoft-Windows-Diagnostics-Performance/Operational` — boot,
// shutdown, standby, resume degradation (Event IDs 100/200/300/400
// family), each carrying total time + the component that caused it.
// * `Microsoft-Windows-Resource-Exhaustion-Detector/Operational` —
// Event ID 2004 fires when virtual memory hits the wall, with the
// top processes by working set.
// * `System` — IDs 41 (Kernel-Power unexpected reboot), 6008 (dirty
// shutdown), 1001 (BugCheck / BSOD).
//
// Cadence is intentionally low (5 min): these events are sparse — a
// healthy machine may produce zero per day, a sick one a handful.
//
// State machine per channel: a numeric RecordId cursor persisted in
// the agent config (`Config::set_option`). First run with no cursor
// pulls the trailing 7 days bounded by `MAX_FIRST_RUN_EVENTS`; every
// subsequent run pulls everything with `EventRecordID > cursor`. The
// cursor advances on observation, not on successful POST — see the
// `flush_once` comment for the tradeoff.
#![cfg_attr(not(target_os = "windows"), allow(dead_code))]
use anyhow::{anyhow, Result};
use serde::Deserialize;
use std::sync::Mutex;
use std::time::Duration;
/// How often to scrape. 5 min keeps the UI freshness reasonable
/// (operators looking at a complaint won't usually see a stale view)
/// without burning host CPU on PowerShell startup every minute.
const SCRAPE_INTERVAL: Duration = Duration::from_secs(5 * 60);
const FLUSH_INTERVAL_BASE: Duration = Duration::from_secs(60);
const FLUSH_INTERVAL_MAX: Duration = Duration::from_secs(15 * 60);
/// First-run lookback. A box that's been alive for months would
/// otherwise dump thousands of `Diagnostics-Performance` events on a
/// fresh install — useful in theory, but the UI shows the most recent
/// 20 and the older entries are mostly noise.
const MAX_FIRST_RUN_EVENTS: u32 = 100;
/// Per-scrape cap so a misbehaving event log can't blow up an
/// individual run. Anything beyond this on a single pass is dropped on
/// the floor and picked up on the next scrape (cursor advances to the
/// last seen, so we don't oscillate).
const MAX_PER_SCRAPE: u32 = 200;
/// Must match the server's MAX_EVENTS_PER_POST.
const MAX_EVENTS_PER_POST: usize = 128;
/// Drop pending events after this many retries — at 15 min cap this
/// is ~5 days, plenty for any realistic outage window.
const DROP_AFTER: u32 = 480;
/// One channel config: the WEL log name, the short provider tag stored
/// server-side (matches the `devices.perf_src_*` i18n keys), and an
/// optional ID allow-list. `None` means "everything in this channel";
/// `Some(&[…])` restricts to those event IDs (used for `System` to
/// avoid pulling boot / service-start chatter).
struct ChannelCfg {
provider: &'static str,
log_name: &'static str,
event_ids: Option<&'static [u32]>,
}
const CHANNELS: &[ChannelCfg] = &[
ChannelCfg {
provider: "diag-perf",
log_name: "Microsoft-Windows-Diagnostics-Performance/Operational",
event_ids: None,
},
ChannelCfg {
provider: "res-exh",
log_name: "Microsoft-Windows-Resource-Exhaustion-Detector/Operational",
event_ids: None,
},
ChannelCfg {
// Subset of `System` that's actually performance-relevant —
// 41 = Kernel-Power unexpected reboot, 6008 = dirty shutdown,
// 1001 = BugCheck (BSOD report). Adding more IDs is just a
// matter of extending this slice.
provider: "system",
log_name: "System",
event_ids: Some(&[41, 6008, 1001]),
},
];
/// JSON shape we ask PowerShell to emit. Mirrors `PerfEventIn` on the
/// server.
#[derive(Debug, Clone, Deserialize)]
struct RawEvent {
at: i64,
event_id: i64,
level: i64,
record_id: i64,
#[serde(default)]
summary: String,
#[serde(default)]
detail: String,
}
#[derive(Clone, Debug)]
struct PendingPerfEvent {
provider: &'static str,
at: i64,
event_id: i64,
level: i64,
record_id: i64,
summary: String,
detail: String,
attempts: u32,
}
#[cfg(target_os = "windows")]
static QUEUE: Mutex<Vec<PendingPerfEvent>> = Mutex::new(Vec::new());
pub fn start() {
#[cfg(not(target_os = "windows"))]
{}
#[cfg(target_os = "windows")]
{
use std::sync::atomic::{AtomicBool, Ordering};
static STARTED: AtomicBool = AtomicBool::new(false);
if STARTED.swap(true, Ordering::SeqCst) {
return;
}
std::thread::spawn(move || {
let rt = match tokio::runtime::Builder::new_current_thread()
.enable_all()
.build()
{
Ok(rt) => rt,
Err(e) => {
log::warn!("perf-events: build runtime: {e}");
return;
}
};
rt.block_on(run_loop());
});
}
}
#[cfg(target_os = "windows")]
async fn run_loop() {
// Stagger the first scrape so we don't pile on the
// unattended_password POST + login_events first poll + sysinfo
// upload all at the same boot moment. PowerShell first invocation
// is ~1 s of CPU; doing it 30 s in instead of immediately is
// friendlier on cold-boot CPU load.
tokio::time::sleep(Duration::from_secs(30)).await;
let mut flush_backoff = FLUSH_INTERVAL_BASE;
loop {
for ch in CHANNELS {
match scrape_channel(ch) {
Ok(events) if !events.is_empty() => {
log::info!(
"perf-events: {} new from {}",
events.len(),
ch.provider,
);
QUEUE.lock().unwrap().extend(events);
}
Ok(_) => {}
Err(e) => {
log::warn!(
"perf-events: scrape {} failed: {e:#}",
ch.provider,
);
}
}
}
match flush_once().await {
FlushOutcome::Idle | FlushOutcome::AllSent => {
flush_backoff = FLUSH_INTERVAL_BASE;
}
FlushOutcome::Failed => {
flush_backoff = (flush_backoff * 2).min(FLUSH_INTERVAL_MAX);
}
}
tokio::time::sleep(SCRAPE_INTERVAL.min(flush_backoff)).await;
}
}
// ─────────────────────────── per-channel scrape ───────────────────────────
#[cfg(target_os = "windows")]
fn scrape_channel(ch: &ChannelCfg) -> Result<Vec<PendingPerfEvent>> {
let cursor = read_cursor(ch.provider);
let script = build_script(ch, cursor);
let stdout = run_powershell(&script)?;
if stdout.is_empty() {
return Ok(Vec::new());
}
// ConvertTo-Json with a single-element array still emits a JSON
// object (PowerShell's "unrolling" quirk); coerce both shapes.
let trimmed = stdout.trim();
if trimmed == "[]" || trimmed.is_empty() {
return Ok(Vec::new());
}
let raw: Vec<RawEvent> = match hbb_common::serde_json::from_str(trimmed) {
Ok(v) => v,
Err(_) => {
// Single-object case
match hbb_common::serde_json::from_str::<RawEvent>(trimmed) {
Ok(single) => vec![single],
Err(e) => {
return Err(anyhow!(
"PowerShell output is not valid JSON: {e}; first 200 chars: {:.200}",
trimmed
));
}
}
}
};
let mut max_record_id = cursor;
let mut out = Vec::with_capacity(raw.len());
for ev in raw {
if (ev.record_id as u64) > max_record_id {
max_record_id = ev.record_id as u64;
}
out.push(PendingPerfEvent {
provider: ch.provider,
at: ev.at,
event_id: ev.event_id,
level: ev.level,
record_id: ev.record_id,
summary: ev.summary,
detail: ev.detail,
attempts: 0,
});
}
// Advance the cursor on observation, not on successful POST. The
// tradeoff: an agent that crashes between observing and POSTing
// loses those rows from the UI. Windows still has them in the
// event log, so the operator can fall back to Event Viewer if
// they really need them; we prefer that to repeatedly re-observing
// a backlog the server has already taken (unique-index dedup
// would absorb it, but the agent's queue would grow unbounded
// every scrape).
if max_record_id > cursor {
write_cursor(ch.provider, max_record_id);
}
Ok(out)
}
/// Build the PowerShell script for one channel. Two shapes:
///
/// * `cursor == 0` (first run): `FilterHashtable` with a 7-day
/// `StartTime` and the optional ID allow-list. The hashtable form
/// is the only one that accepts both a time bound and an `Id`
/// array in one call.
/// * `cursor > 0`: `FilterXPath` with `EventRecordID > $cursor` and
/// the optional ID-list expanded to `(EventID=A or EventID=B …)`.
#[cfg(target_os = "windows")]
fn build_script(ch: &ChannelCfg, cursor: u64) -> String {
// PowerShell-quote the log name (single-quote escape = doubled
// single quote).
let q = |s: &str| s.replace('\'', "''");
let log_name_quoted = q(ch.log_name);
let filter_clause = if cursor == 0 {
// FilterHashtable + StartTime + optional Id array.
match ch.event_ids {
Some(ids) => {
let id_list = ids
.iter()
.map(|i| i.to_string())
.collect::<Vec<_>>()
.join(",");
format!(
"-FilterHashtable @{{LogName='{ln}'; StartTime=(Get-Date).AddDays(-7); Id=@({ids})}} -MaxEvents {max}",
ln = log_name_quoted,
ids = id_list,
max = MAX_FIRST_RUN_EVENTS,
)
}
None => format!(
"-FilterHashtable @{{LogName='{ln}'; StartTime=(Get-Date).AddDays(-7)}} -MaxEvents {max}",
ln = log_name_quoted,
max = MAX_FIRST_RUN_EVENTS,
),
}
} else {
// FilterXPath. Note PowerShell expands `$cursor` from the
// outer script, so we splat the literal value into the XPath.
let id_clause = match ch.event_ids {
Some(ids) => {
let or = ids
.iter()
.map(|i| format!("System/EventID={i}"))
.collect::<Vec<_>>()
.join(" or ");
format!(" and ({or})")
}
None => String::new(),
};
// Double-curly to escape format!'s own `{}` interpolation.
format!(
"-LogName '{ln}' -FilterXPath \"*[System/EventRecordID>{cur}{id}]\" -MaxEvents {max}",
ln = log_name_quoted,
cur = cursor,
id = id_clause,
max = MAX_PER_SCRAPE,
)
};
// Single-quoted PowerShell here-doc would escape too aggressively;
// we stick to plain string concatenation. The `try / catch` block
// returns '[]' on any failure so the Rust side gets a parseable
// empty array rather than a stderr blob.
format!(
r#"$ErrorActionPreference = 'SilentlyContinue'
try {{
$events = Get-WinEvent {filter} 2>$null
if ($null -eq $events) {{ '[]' ; exit 0 }}
$arr = @($events | Sort-Object RecordId | ForEach-Object {{
$msg = $_.Message
if ($null -eq $msg) {{ $msg = '' }}
$oneline = ($msg -replace "(`r?`n)+", " ").Trim()
if ($oneline.Length -gt 300) {{ $oneline = $oneline.Substring(0,300) }}
$detail = $msg
if ($detail.Length -gt 4000) {{ $detail = $detail.Substring(0,4000) }}
[PSCustomObject]@{{
at = [int64](([System.DateTimeOffset]$_.TimeCreated.ToUniversalTime()).ToUnixTimeSeconds())
event_id = $_.Id
level = $_.Level
record_id = $_.RecordId
summary = $oneline
detail = $detail
}}
}})
$arr | ConvertTo-Json -Depth 3 -Compress
}} catch {{
'[]'
}}
"#,
filter = filter_clause,
)
}
#[cfg(target_os = "windows")]
fn run_powershell(script: &str) -> Result<String> {
use std::os::windows::process::CommandExt;
use std::process::Command;
// Matches inventory.rs — hides the brief console flash when the
// agent runs interactively (dev mode); no effect in service mode.
const CREATE_NO_WINDOW: u32 = 0x08000000;
let output = Command::new("powershell.exe")
.args([
"-NoProfile",
"-NonInteractive",
"-ExecutionPolicy",
"Bypass",
"-Command",
script,
])
.creation_flags(CREATE_NO_WINDOW)
.output()
.map_err(|e| anyhow!("spawn powershell: {e}"))?;
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr);
return Err(anyhow!(
"powershell exited {:?}: {}",
output.status.code(),
stderr.trim()
));
}
Ok(String::from_utf8_lossy(&output.stdout).trim().to_string())
}
// ─────────────────────────────── cursor I/O ───────────────────────────────
#[cfg(target_os = "windows")]
fn cursor_key(provider: &str) -> String {
format!("perf-event-cursor-{provider}")
}
#[cfg(target_os = "windows")]
fn read_cursor(provider: &str) -> u64 {
let raw = hbb_common::config::Config::get_option(&cursor_key(provider));
raw.parse::<u64>().unwrap_or(0)
}
#[cfg(target_os = "windows")]
fn write_cursor(provider: &str, value: u64) {
hbb_common::config::Config::set_option(
cursor_key(provider),
value.to_string(),
);
}
// ───────────────────────────────── flush ──────────────────────────────────
#[cfg(target_os = "windows")]
enum FlushOutcome {
Idle,
AllSent,
Failed,
}
#[cfg(target_os = "windows")]
async fn flush_once() -> FlushOutcome {
let batch: Vec<PendingPerfEvent> = {
let mut q = QUEUE.lock().unwrap();
if q.is_empty() {
return FlushOutcome::Idle;
}
let take = q.len().min(MAX_EVENTS_PER_POST);
q.drain(..take).collect()
};
match post_batch(&batch).await {
Ok(()) => FlushOutcome::AllSent,
Err(e) => {
log::warn!(
"perf-events: flush of {} event(s) failed: {e:#}",
batch.len(),
);
let mut requeued: Vec<PendingPerfEvent> = batch
.into_iter()
.filter_map(|mut ev| {
ev.attempts = ev.attempts.saturating_add(1);
if ev.attempts >= DROP_AFTER {
log::warn!(
"perf-events: dropping event after {} attempts: \
provider={} record_id={}",
ev.attempts, ev.provider, ev.record_id,
);
None
} else {
Some(ev)
}
})
.collect();
let mut q = QUEUE.lock().unwrap();
let tail: Vec<PendingPerfEvent> = q.drain(..).collect();
requeued.extend(tail);
*q = requeued;
FlushOutcome::Failed
}
}
}
#[cfg(target_os = "windows")]
async fn post_batch(batch: &[PendingPerfEvent]) -> Result<()> {
let api = librustdesk::common::get_api_server(
hbb_common::config::Config::get_option("api-server"),
hbb_common::config::Config::get_option("custom-rendezvous-server"),
);
if api.is_empty() {
return Err(anyhow!("no api-server configured yet"));
}
let url = format!("{api}/api/agent/perf-events");
let id = hbb_common::config::Config::get_id();
let uuid = librustdesk::common::encode64(hbb_common::get_uuid());
let events: Vec<hbb_common::serde_json::Value> = batch
.iter()
.map(|ev| {
hbb_common::serde_json::json!({
"at": ev.at,
"provider": ev.provider,
"event_id": ev.event_id,
"level": ev.level,
"record_id": ev.record_id,
"summary": ev.summary,
"detail_json": ev.detail,
})
})
.collect();
let body = hbb_common::serde_json::json!({
"id": id,
"uuid": uuid,
"events": events,
})
.to_string();
let headers = librustdesk::hbbs_http::sign::build_signed_headers(
"POST",
"/api/agent/perf-events",
body.as_bytes(),
)
.unwrap_or_default();
let resp = librustdesk::common::post_request(url, body, &headers)
.await
.map_err(|e| anyhow!("post: {e}"))?;
let trimmed = resp.trim();
if trimmed == "OK" || trimmed == "ID_NOT_FOUND" {
Ok(())
} else {
Err(anyhow!("unexpected response: {trimmed}"))
}
}
+21
View File
@@ -711,6 +711,27 @@ fn service_main_inner() -> Result<()> {
// shutdown hook (the SCM termination is enough). // shutdown hook (the SCM termination is enough).
crate::login_events::start(); crate::login_events::start();
// Start the continuous performance sampler: one CPU / memory /
// top-process sample per minute, posted in batches to
// /api/agent/metrics. Powers the device-detail Performance card
// and 24 h sparkline.
crate::perf::start();
// Start the Windows-event-log perf-event scraper: pulls boot /
// shutdown / sleep degradation, memory exhaustion, BSOD and
// unexpected-reboot events from the OS-managed channels and
// POSTs them to /api/agent/perf-events. Persists a per-channel
// RecordId cursor in the agent config so a restart doesn't
// re-emit the whole history.
crate::perf_events::start();
// Start the network-change monitor: polls public IP, per-interface LAN
// IPv4, and Wi-Fi SSID/BSSID once a minute, diffs against the persisted
// snapshot, and POSTs only changes to /api/agent/network-events. Powers
// the device Network-history tab and the fleet network log.
#[cfg(target_os = "windows")]
crate::net_monitor::start();
// Worker process handle. Killed on Stop, replaced on session change. // Worker process handle. Killed on Stop, replaced on session change.
// `last_state` carries (session_id, had_user). The `had_user` bit is // `last_state` carries (session_id, had_user). The `had_user` bit is
// what forces a respawn when a user logs in to a session we're // what forces a respawn when a user logs in to a session we're
+36
View File
@@ -184,6 +184,15 @@ async fn start_hbbs_sync_async() {
v["version"] = json!(crate::VERSION); v["version"] = json!(crate::VERSION);
v["id"] = json!(id); v["id"] = json!(id);
v["uuid"] = json!(crate::encode64(hbb_common::get_uuid())); v["uuid"] = json!(crate::encode64(hbb_common::get_uuid()));
// opsbase enrollment: advertise our Ed25519 public key so the
// server can pin it trust-on-first-use and verify our signed
// requests. This is the same keypair `sign.rs` signs with and
// rendezvous registers. Harmless on vanilla rustdesk servers,
// which ignore unknown sysinfo fields.
let (_sk, pk_bytes) = Config::get_key_pair();
if !pk_bytes.is_empty() {
v["pk"] = json!(crate::encode64(&pk_bytes));
}
// Optional rebrand identity: `AGENT_NAME` / `AGENT_VERSION` // Optional rebrand identity: `AGENT_NAME` / `AGENT_VERSION`
// are empty by default (vanilla rustdesk) and populated by // are empty by default (vanilla rustdesk) and populated by
// OEM shells like hello-agent. We only stamp the field // OEM shells like hello-agent. We only stamp the field
@@ -404,9 +413,28 @@ fn heartbeat_url() -> String {
format!("{}/api/heartbeat", url) format!("{}/api/heartbeat", url)
} }
/// LocalConfig key holding the JSON list of option keys the server's strategy
/// applied on the previous push, so we can make strategies *authoritative*.
const STRATEGY_MANAGED_KEYS: &str = "strategy_managed_keys";
fn handle_config_options(config_options: HashMap<String, String>) { fn handle_config_options(config_options: HashMap<String, String>) {
let mut options = Config::get_options(); let mut options = Config::get_options();
let default_settings = config::DEFAULT_SETTINGS.read().unwrap().clone(); let default_settings = config::DEFAULT_SETTINGS.read().unwrap().clone();
// hello-agent local patch — authoritative strategies. The server's strategy
// owns exactly the keys it sends. Any key it managed on the previous push
// but no longer sends is reset to its default (the override is dropped), so
// removing a key/strategy on the server actually clears it on the device.
// Keys the server never managed (the user's own local settings) are left
// untouched, so this stays a managed-config overlay rather than a wipe.
let prev_managed: Vec<String> =
serde_json::from_str(&LocalConfig::get_option(STRATEGY_MANAGED_KEYS)).unwrap_or_default();
for k in prev_managed.iter() {
if !config_options.contains_key(k) {
options.remove(k);
}
}
config_options config_options
.iter() .iter()
.map(|(k, v)| { .map(|(k, v)| {
@@ -420,6 +448,14 @@ fn handle_config_options(config_options: HashMap<String, String>) {
} }
}) })
.count(); .count();
// Remember the keys we now manage, for the next diff.
let managed: Vec<&String> = config_options.keys().collect();
LocalConfig::set_option(
STRATEGY_MANAGED_KEYS.to_string(),
serde_json::to_string(&managed).unwrap_or_default(),
);
Config::set_options(options); Config::set_options(options);
} }