Scientific Computing and Data Facility

Recent Announcements

05/01/2026

IT Services

Group Responsible: IT Services

Affected Area: SSH logins for SCDF Staff will be interrupted.

Maintenance Type: service interruption

Because of the Linux kernel issue described in CVE-2026-31431, we will schedule a reboot of admingw01&02 and staffgw01&02. Connections will drop during the reboot. The interruption should be brief.


Submitted by: Joe Frith <jfrith@bnl.gov>

04/30/2026

IT Fabric

Group Responsible: IT Fabric

Affected Area: SCDF managed interactive/submit nodes

Maintenance Type: service interruption

To remediate a significant Linux vulnerability, all SCDF-managed interactive/submit nodes will be rebooted shortly, with a ~10 minute warning message. Submitted condor jobs will continue running and will reconnect with the submit node after the reboot. For worker nodes, we are working on a solution that balances a prompt response against disruption of running jobs. Interactive logins and processes will be terminated when the interactive/submit nodes reboot. Once the nodes are back up, you should be able to continue your work. Thank you for your patience and understanding.


Submitted by: Matt Cowan <cowan@bnl.gov>

04/02/2026

Service & Tools

Group Responsible: Service & Tools

Affected Area: BNLBox

Maintenance Type: scheduled downtime

The BNLBox service will be down for up to 1 hour for maintenance.


Submitted by: Hironori Ito <hito@bnl.gov>

04/01/2026

IT Services

Group Responsible: IT Services

Affected Area: RT

Maintenance Type: service interruption

The RT VM will be migrated off the RHEV platform. The RT web site will be unavailable during the migration but emails will be queued and delivered once the migration is complete. This will begin at 8:30am on Wednesday April 1st and is expected to take 3-4 hours. An update will be sent when the migration is complete.


Submitted by: Mark Berry <mberry@bnl.gov>

03/25/2026

Service & Tools

Group Responsible: Service & Tools

Affected Area: BNLBox

Maintenance Type: transparent

The scheduled BNLBox upgrade for today has been canceled. A new announcement will be issued once downtime has been rescheduled.


Submitted by: Louis Pelosi <lpelosi@bnl.gov>

03/25/2026

Service & Tools

Group Responsible: Service & Tools

Affected Area: BNLBox

Maintenance Type: scheduled downtime

BNLBox will be unavailable on 03/25/2026 starting at 08:15 AM EST for approximately one hour to perform a scheduled upgrade. During this time, users will experience a temporary disruption in service. We apologize for the inconvenience and appreciate your patience.


Submitted by: Louis Pelosi <lpelosi@bnl.gov>

03/02/2026

IT Services

Group Responsible: IT Services

Affected Area: Atlas GPFS

Maintenance Type: transparent

We will be upgrading the Atlas GPFS servers (atlasgpfs01) on Monday 3/2/2026 beginning at 10:00 AM. This should be a transparent upgrade to the RHEL and GPFS software. If there are any issues, please open an RT ticket so they can be addressed.


Submitted by: Joe Frith <jfrith@bnl.gov>

02/19/2026

IT Services

Group Responsible: IT Services

Affected Area: HPC GPFS & RHEL 8

Maintenance Type: transparent

We will be upgrading the HPC/IC2 GPFS servers (hpcgpfs01) beginning at 9:30 AM. This should be a transparent upgrade to the RHEL and GPFS software. If there are any issues, please open an RT ticket so they can be addressed.


Submitted by: Joe Frith <jfrith@bnl.gov>

02/17/2026

Network

Group Responsible: Network

Affected Area: Access to selected storage systems at the SCDF

Maintenance Type: service interruption

A change in the configuration of the storage network in the SCDF legacy data center (Bldg 515/CDCE) will result in a brief disruption in connectivity to storage systems located in the legacy data center from other systems at the SCDF. This configuration change is scheduled for 9:30AM EST on Tuesday February 17.


Submitted by: Shigeki Misawa <misawa@bnl.gov>

02/17/2026

Network

Group Responsible: Network

Affected Area: Connectivity to the SCDF from the general internet

Maintenance Type: transparent

The SCDF firewall, which separates the SCDF facility from the general internet and the BNL campus network, will be upgraded in two steps. Step one - Feb 17 at 6:00AM EST - This step is expected to be transparent, as existing network connections through the firewall are expected to remain intact. Step two - Feb 18 at 6:00AM EST - This step will break existing network connections through the firewall. Users will need to re-establish any connections that break.


Submitted by: Shigeki Misawa <misawa@bnl.gov>

02/02/2026

IT Services

Group Responsible: IT Services

Affected Area: AFS services

Maintenance Type: service interruption

After discussion with the Atlas experiment, we have confirmed that it is no longer using the AFS file servers. We plan to retire and shut down all Atlas AFS services on 2/2/2026. If any files are still being accessed, please move them to a new location.


Submitted by: Joe Frith <jfrith@bnl.gov>

01/21/2026

IT Fabric

Group Responsible: IT Fabric

Affected Area: b725 datacenter

Maintenance Type: service interruption

Systems have been operating stably since the cutover to site chilled water. SCDF is closely monitoring major systems and confirming their functionality one by one. We will continue to run in the current state, with reduced capacities for some systems, until we have more information from F&O and are advised to return all systems to normal operations. We believe all major systems are otherwise functioning; please report any issues.


Submitted by: Matt Cowan <cowan@bnl.gov>

01/21/2026

IT Fabric

Group Responsible: IT Fabric

Affected Area: b725 datacenter

Maintenance Type: service interruption

A major leak in one of the cooling tower loops caused temperatures to rise in a number of compute racks before the backup BNL site chilled water feed kicked in. The main impact appears limited to low density compute rows 103 and 105, affecting sPHENIX and, to a smaller extent, the shared pool and ATLAS Tier1. We currently believe all systems are functioning, although sPHENIX, the shared pool and ATLAS Tier1 are running with reduced capacities. Please report any issues.


Submitted by: Matt Cowan <cowan@bnl.gov>

01/16/2026

IT Services

Group Responsible: IT Services

Affected Area: All Comanage Federated logins

Maintenance Type: scheduled downtime

Comanage Registry is undergoing an upgrade and will be unavailable for Federated Logins for 15 min at 6 am Central Time on Friday January 16th. This will affect a variety of services but will be transparent if you are already logged into those services.


Submitted by: Robert Hancock <hancock@bnl.gov>

01/06/2026

IT Services

Group Responsible: IT Services

Affected Area: Globus Access will be down

Maintenance Type: scheduled downtime

Globus Access (https://app.globus.org/) will be down for maintenance from 11AM-3PM today 01/06/26.


Submitted by: Joe Frith <jfrith@bnl.gov>

12/30/2025

Service & Tools

Group Responsible: Service & Tools

Affected Area: BNLBox

Maintenance Type: scheduled downtime

The BNLBox update has been completed. If there are any issues, please submit tickets to the RT Storage Management queue at RT-RACF-StorageManagement@bnl.gov.


Submitted by: Louis Pelosi <lpelosi@bnl.gov>

12/30/2025

Service & Tools

Group Responsible: Service & Tools

Affected Area: BNLBox

Maintenance Type: scheduled downtime

BNLBox will be unavailable starting at 11:00 AM EST on December 30, 2025, due to scheduled maintenance. During this time, the service will be inaccessible. Users should complete any work before the maintenance window begins. Normal operations are expected to resume once maintenance is complete, on or before 5:00 PM EST on December 30, 2025. Users may resume usage at that time. Thank you for your patience during this scheduled downtime.


Submitted by: Louis Pelosi <lpelosi@bnl.gov>

12/08/2025

IT Fabric

Group Responsible: IT Fabric

Affected Area: b725 Network and Tape rooms

Maintenance Type: transparent

F&O is performing inspection and remediation work on PDU3A4, which feeds the Network and Tape rooms in the b725 datacenter, today from ~9:30am to 3pm. All equipment is powered redundantly from another PDU, so no outage is expected, but systems will be at greater risk than normal.


Submitted by: Matt Cowan <cowan@bnl.gov>

12/05/2025

Service & Tools

Group Responsible: Service & Tools

Affected Area: SCDF Mattermost

Maintenance Type: service interruption

The SCDF Mattermost service has been updated and service has resumed. Thank you for your patience. If there are any issues, please report them by opening an RT ticket in the Software queue 'RT-RACF-Software@bnl.gov'.


Submitted by: Louis Pelosi <lpelosi@bnl.gov>

12/05/2025

Service & Tools

Group Responsible: Service & Tools

Affected Area: SCDF Mattermost

Maintenance Type: service interruption

An emergency update needs to be applied to the SCDF Mattermost service. Users should not experience any interruption; however, if there is an issue, the service will be down temporarily. This update will fix an issue where users are unable to retrieve group messages using the '+' button in the sidebar, and it will also update the web UI. Apologies for the interruption and thank you for your patience.


Submitted by: Louis Pelosi <lpelosi@bnl.gov>