
Scientific Computing and Data Facility

Recent Announcements


10/21/2025

IT Services

Group Responsible: IT Services

Affected Area: OpenShift Virtual Environment

Maintenance Type: Service Interruption

An upgrade of OpenShift began today. Although we were informed that it would be transparent to all users, we have found that some network interruptions are occurring. The upgrade cannot be interrupted once it has started. The main servers have completed the upgrade, and we do not expect further interruptions, although some remain possible as the upgrade proceeds.


Submitted by: Joe Frith <jfrith@bnl.gov>

04/26/2025

IT Services

Group Responsible: IT Services

Affected Area: SCDF Services

Maintenance Type: Unknown

The SDCC recently unveiled its new website, http://www.sdcc.bnl.gov, which serves as the entry point to facility services and support. In addition, we have created a public MatterMost channel for feedback and recommendations on the new website. Users are welcome to join this channel (see link below) and participate.

https://chat.sdcc.bnl.gov/bnl/channels/sdcc-website-feedback

Please note that this channel is not meant to be used for support issues (broken links, missing documentation, requests for changes, etc.). Support requests must be made through the RT ticket system (go to http://www.sdcc.bnl.gov and select 'Get Help').


Submitted by: Tony Wong

04/26/2025

Network

Group Responsible: Network

Affected Area: SCDF Services

Maintenance Type: Unknown

Hi all,

The clusters have been back to normal operations since 12:30 pm.

The systems affected were the IC cluster, the Skylake cluster, and the Volta cluster.

Regards,
Costin


Submitted by: Costin Caramarcu

04/26/2025

Services & Tools

Group Responsible: Services & Tools

Affected Area: SCDF Communications

Maintenance Type: Service Interruption

This is just testing the db write


Submitted by: Facility Staff <unknown@example.com>

02/12/2024

Services & Tools

Group Responsible: Services & Tools

Affected Area: Star GPFS

Maintenance Type: Unknown

Access to Star GPFS (gpfs/gpfs01) has been restored.

End Date/Time: 2/12/2024 10:01 pm

Expected Impact: access is available


Submitted by: Test User

02/12/2024

IT Services

Group Responsible: IT Services

Affected Area: Star GPFS

Maintenance Type: Scheduled Downtime

The Star GPFS file system experienced serious errors over the weekend and is currently offline. We have opened a ticket with the vendor and are working to get this resolved. We will send updates as they become available.

End Date/Time: 2/13/2024 7:27 am

Expected Impact: access is unavailable until issue is resolved


Submitted by: Test User

01/22/2024

Experimental Support

Group Responsible: Experimental Support

Affected Area: US ATLAS dCache storage service

Maintenance Type: Service Interruption

The US ATLAS storage system is scheduled for an upgrade to the latest software release, dCache 9.2. As a result, the storage system will be inaccessible from 9:00 AM to 1:00 PM (EST) on January 22, 2024. We apologize in advance for any inconvenience this work may cause.

End Date/Time: 1/22/2024 1:00 pm

Expected Impact: US ATLAS dCache storage service interruption


Submitted by: Test User

01/05/2024

IT Fabric

Group Responsible: IT Fabric

Affected Area: EIC Lustre

Maintenance Type: Scheduled Downtime

eicoss02.sdcc.bnl.local is back online after the outage.

End Date/Time: 1/5/2024 5:15 pm

Expected Impact: Lustre file system for EIC won't be available.


Submitted by: Test User

01/05/2024

Services & Tools

Group Responsible: Services & Tools

Affected Area: EIC Lustre

Maintenance Type: Scheduled Downtime

Due to hardware issues, eicoss02.sdcc.bnl.local needs to be brought offline.

End Date/Time: 1/5/2024 5:00 pm

Expected Impact: Lustre file system for EIC won't be available.



12/19/2023

Experimental Support

Group Responsible: Experimental Support

Affected Area: SCDF Services

Maintenance Type: Service Interruption

On Tuesday, December 19th, from 6:30 am EST to 9 pm EST, network connectivity (both WAN and Campus) to/from the SDCC Facility and Science DMZ will be unavailable due to scheduled maintenance. This event marks the final phase of the years-long migration of SDCC services from the old data center to the new one. We anticipate the full restoration of network connectivity by 9 pm EST on December 19th, concluding the outage.

During the intervention period, access to SDCC services, including computing resources (both interactive and batch), storage services (disk and tape), and collaborative tools (such as BNLBox, RCF email, MatterMost, and web services), will not be available.

To prepare for this network outage, the scheduling of new HTCondor and Slurm batch jobs will cease on Friday, December 15th, at 11 pm EST. This pause will facilitate the smooth draining and graceful termination of ongoing computing jobs. In addition, access to storage and data transfer services will be temporarily halted on Monday, December 18th, at 5 pm EST.

This outage does not affect BNL mail service (@bnl.gov domain), which will be available during this period. The SDCC will notify the community about the full restoration of services through its regular mailing lists.

End Date/Time: 12/19/2023 9:00 pm

Expected Impact: No access to SDCC services during downtime


Submitted by: Test User

12/18/2023

Network

Group Responsible: Network

Affected Area: SCDF Services

Maintenance Type: Service Interruption

This extended downtime is necessary for two key activities: upgrading user home directories beginning on Monday (for all programs except NSLS) and network maintenance scheduled for Tuesday (affecting all customers). This maintenance marks the final phase of the multi-year migration of SDCC services to a new data center.

Impact on Services:

- Storage and Data Transfer Services: Access will be suspended starting Monday, December 18th, at 5 PM EST to allow for a clone of the users' home directories. Special Note for NSLS Program Users: The NFS home directory work scheduled from Monday to Tuesday does not impact the NSLS program. NSLS program users should expect regular operation during this period (until the network maintenance begins on Tuesday).
- Computing Resources: Access to computing resources (interactive and batch), storage services (disk and tape), and collaborative tools (BNLBox, RCF email, MatterMost, web services) will be unavailable throughout the respective maintenance periods.
- Batch Job Scheduling: Scheduling of new HTCondor and Slurm batch jobs will cease on Friday, December 15th, at 11 PM EST. Any remaining HTCondor jobs will be terminated on Monday, December 18th, at 3 PM.
- Interactive Sessions: Open sessions through the facility SSH gateways will be terminated at 5 PM on December 18th to facilitate the cloning/copying of user home directories.
- Email Services: The @bnl.gov email services may experience interruptions. Alternatively, users are encouraged to use the sdcc-staff-l@lists.bnl.gov mailing list to contact SDCC staff.
- NX Service and SSH Gateway: The NX service and SSH Gateway will be unavailable. Existing cssh sessions initiated before the outage will continue to function.

Restoration of Services: Complete restoration of services is anticipated by 9 PM EST on December 19th. The SDCC will notify the community through its regular mailing lists.

We apologize for the inconvenience this may cause and thank you for your understanding and cooperation as we complete this essential upgrade to our infrastructure.

End Date/Time: 12/19/2023 9:00 pm

Expected Impact: No access to SDCC services during downtime


Submitted by: Test User

11/20/2023

Experimental Support

Group Responsible: Experimental Support

Affected Area: SCDF Mattermost

Maintenance Type: Service Interruption

The SDCC Mattermost service (chat.sdcc.bnl.gov) will be suspended on Monday, November 20, 2023, from 6 PM EST to 7 PM EST.

End Date/Time: 11/20/2023 7:00 pm

Expected Impact: Service will be suspended


Submitted by: Test User

10/13/2023

Network

Group Responsible: Network

Affected Area: SFTP Services

Maintenance Type: Service Interruption

Due to a security vulnerability in Red Hat Linux, the SFTP & CFTP servers will be updated and rebooted between 12:30 PM EST and 1:00 PM EST today. Existing connections will be disconnected during the reboots.

End Date/Time: 10/13/2023 1:30 pm

Expected Impact: access will be unavailable and current sessions will be ended


Submitted by: Test User

10/03/2023

Network

Group Responsible: Network

Affected Area: Globus

Maintenance Type: Service Interruption

DTN03/IC Globus maintenance is scheduled for 10/03/23. The server will be down for approximately 1-2 hours while it is relocated in the data center.

End Date/Time: 10/3/2023 3:30 pm

Expected Impact: access will be unavailable and current sessions will be ended


Submitted by: Test User

09/26/2023

Network

Group Responsible: Network

Affected Area: NX Service

Maintenance Type: Service Interruption

The NX sessions on nxterm and nxcampus servers will be terminated. Please save your work.

End Date/Time: 9/26/2023 12:00 pm

Expected Impact: NX sessions on nxterm and nxcampus servers will be terminated


Submitted by: Test User

09/26/2023

IT Fabric

Group Responsible: IT Fabric

Affected Area: NX Service

Maintenance Type: Service Interruption

The NX sessions on nxterm and nxcampus servers will be terminated. Please save your work.

End Date/Time: 9/26/2023 12:00 pm

Expected Impact: NX sessions on nxterm and nxcampus servers will be terminated


Submitted by: Test User

06/06/2023

Experimental Support

Group Responsible: Experimental Support

Affected Area: PHENIX interactive nodes

Maintenance Type: Service Interruption

A portion of the PHENIX interactive nodes, rcas2069-2076, will be shut down and repurposed/renamed from PHENIX to sPHENIX nodes on Tuesday 6/6 at 10:00 AM. Any locally stored data on these nodes (files in /home, /tmp, and /var/tmp) will be destroyed as part of the system rebuild process. Please be sure to log out and transfer any local files elsewhere before the scheduled shutdown. rcas2061-2068 will remain available.

Expected Impact: A portion of the PHENIX interactive nodes will be unavailable


Submitted by: Chris Hollowell

05/09/2023

IT Services

Group Responsible: IT Services

Affected Area: SCDF Mattermost

Maintenance Type: Service Interruption

The SDCC Mattermost service at chat.sdcc.bnl.gov will be turned off temporarily for an upgrade. During this time, users will be unable to access the service and will receive an error that they cannot connect to the server. Once the upgrade is complete, users should be able to resume their activities.

End Date/Time: 5/9/2023 9:00 pm

Expected Impact: Unable to connect to chat.sdcc.bnl.gov


Submitted by: Louis Pelosi

05/01/2023

IT Services

Group Responsible: IT Services

Affected Area: sPHENIX Lustre System

Maintenance Type: Service Interruption

We are planning a major upgrade for the sPHENIX system, scheduled to start on Monday, May 1st at 9:00 AM and continue for three days until Wednesday, May 3rd. This upgrade will move the system from its current version, 2.12.8, to the new version, 2.15.2.

During the upgrade period, the sPHENIX system will be inaccessible to users. Please plan your work accordingly and ensure all necessary data is saved prior to the upgrade. Any running or queued jobs at the time of the upgrade may be affected, and we recommend either completing or rescheduling them beforehand. Starting at 9 AM on May 1st, all unfinished jobs will be terminated.

End Date/Time: 5/3/2023 11:59 pm

Expected Impact: The sPHENIX Lustre system will be unavailable for user access


Submitted by: Jane Liu

05/01/2023

IT Services

Group Responsible: IT Services

Affected Area: sPHENIX Lustre System

Maintenance Type: Service Interruption

The sPHENIX Lustre upgrade window has been extended until the end of May 4th due to an unexpected bug in the new version. The bug is causing issues with the Object Index files, and the team is running the necessary commands to fix them. We understand that this may inconvenience your work, and we apologize for the delay. We will keep you updated on our progress.

End Date/Time: 5/4/2023 11:59 pm

Expected Impact: The sPHENIX Lustre system will be unavailable for user access


Submitted by: Zhenping Liu