Best Practices

By Kemer Thomson, Sun BluePrints Editor

An interesting challenge that is presented by permalinks is that, ... well, they tend to be permanent. What do you do when a particular word embedded in a permalink is no longer relevant? We just ran into that problem (again) with a blog presented by longtime Sun BluePrints author Mikael Lofstrand. We are forced to decide whether to keep the offending permalink or to break it.

Mikael has been working on a blueprint around an exciting network architecture we have been developing at Sun. We proudly referred to it as "Sun's TrueScale architecture." "TrueScale" really captured the essence behind it. Only one problem: it looks like QLogic has already trademarked "TrueScale," and in an area that is close enough to present a problem. This resulted in a classic panic to find another name that was equally appropriate but sufficiently unique. This turns out to be quite a challenge, as it appears that techno-entrepreneurs have been busy locking up any name that looks like something they may be able to sell or otherwise leverage.

In the end, my suggestion of "VeriScale" won. All I did was choose the Latin root for "true." We are re-working the blueprint and Mikael edited his blog, substituting in VeriScale. Unfortunately, "truescale" remains embedded in the permalink. We would rather not break a link that (hopefully) has established itself.

Speaking of what we are now referring to as "Sun's VeriScale architecture," there is now the beginning of a Wiki page devoted to it: http://wikis.sun.com/display/VeriScale. There isn't a lot there just yet, but you might want to check it regularly as we start to populate it with more content.

By Mikael Lofstrand

The modern datacenter is evolving into the cloud computing model, where networking, platform, storage, and software infrastructure are provisioned as services that can scale up or down on demand. This model allows the datacenter to be viewed as a collection of managed application services that are deployed automatically, while utilizing the underlying services. Providing sufficient elasticity and scalability for the rapidly evolving needs of the datacenter requires these collections of automatically-managed services to scale efficiently, and with essentially no limits. Sun calls this truly scalable approach a VeriScale architecture.

Beyond elasticity and scalability, other requirements from the architecture to help ensure its utility include:

  • Providing a simple means to deploy components with ease so that the operational IT environment is efficient
  • Supporting a rapid development cycle to service new requirements and deploy new services
  • Avoiding reliance on centralized provisioning systems since they inherently have breaking points that limit scalability
  • Consisting of self-contained components that include a fully-operational, self-sufficient software stack with applications, virtualization technologies, storage, and networking for modularity and ease of deployment
  • Enabling services to be self-provisioned, while providing cost-efficient, scalable, and rapid deployments with standard technologies

This blog provides an overview of Sun’s VeriScale architecture as well as its defining principles, and demonstrates how it fulfils these requirements.

VeriScale functional components

Sun’s VeriScale architecture is comprised of a set of functional components including:

  • The point of delivery (POD) is part of Sun’s scalable dynamic infrastructure suite (DIS) and represents a self-contained network service stack that includes both software and infrastructure devices. Each POD provides a certain defined functional capacity and can range in size. Each POD is delivered ready to be plugged into the network and participate in the system.
  • The base static platform consists of the interconnected nodes and storage devices that are physically wired and remain statically connected. These infrastructure components are virtualized in higher layers that can then be dynamically reconfigured as needed. The components that are used for this layer are defined per required cost, performance, availability, interfaces, etc. The base static platform can be implemented as a flat network where virtual LANs (VLANs) can exist anywhere on the platform or as a point of delivery.
  • The Sun service delivery network (SDN) architecture provides a set of network connectivity, routing, load balancing, and security mechanisms that combine to form a flexible network infrastructure design framework. This architecture provides high performance, scalability, availability, security, flexibility, and manageability for datacenter infrastructure. The SDN methodology enables a common approach to designing network architectures, and provides a common set of tools that help ensure proper architectural design decisions and trade-offs.
  • The target nodes are the actual compute hosts that run the applications.
  • The services that encapsulate the business functionality provisioned at the point of delivery. These can include infrastructure services, or applications, and include the networking logic they require.

The functional components are managed through OpenSolaris Dynamic Service Containers (DSC) and are automatically deployed with the networking logic encapsulated into the services components, leveraging the SDN architecture.

VeriScale automation

The VeriScale unit of management is a service — not individual functional components. Networking functions are embedded within services so that services are fully functional, self-contained, self-sufficient, and self-provisioned networked entities.

The traditional approach to automation is to have centralized provisioning systems that deploy individual services onto target systems. This approach limits scalability since regardless of the size of the centralized system, there is always a limit to its capacity and throughput, and it can break if there is a demand peak for its services that outstrips its capacity. In contrast, the VeriScale architecture assigns the deployment logic to the target systems, creating an environment where the targets are self-updating and pull components from the appropriate repositories. This strategy improves scalability since the provisioning workload is handled by each target system and not by a centralized resource. Contrast this approach with the very limited level of network automation in traditional architectures and the true power of VeriScale is immediately apparent.

OpenSolaris Dynamic Service Containers


OpenSolaris DSC is a distributed resource and service management system designed for simplicity and scalability, suitable for lightweight applications with simple installation and configuration requirements. OpenSolaris DSC tracks what is going on in the network as a whole, maintains multiple hosts that run the OpenSolaris DSC services and software, and creates and destroys payloads. OpenSolaris DSC implements a pull model that uses resources on the target nodes for provisioning, avoiding the need for a centralized provisioning system with its associated resource bottlenecks. In this model, the resources available for provisioning grow as the number of target nodes increases. The performance limitations then become associated with the individual target nodes and are not related to their number, only to the load capacity of each single target node.

The OpenSolaris DSC consists of the following components:

  • The registry is a repository for information that includes the different elements required for the deployment of services across the scalable architecture. In particular the registry includes the necessary information needed to download and install a payload. Other information includes data used to communicate with the service instances to configure them, the network and capacity information for specific hardware platforms, and other administrative and operational data.
  • The repository is a simple, passive storage facility for the payloads.
  • Node controllers locate the registry and continuously query it for changes of desired state, pull initial configuration data and query it for the desired state of service definitions. In addition, node controllers analyze the suitability of a given POD to host a required workload. When a service is provisioned, a node controller offers to host a workload, downloads and installs the needed payload, starts the service, and updates the registry with its state.
  • Payloads maintain the information and data needed for the node controller to install and run a workload. Examples of payload contents include an install script, a binary, base configuration, and additional content. The payload can also include a load balancer and its initial configuration.
  • Nodes are aggregates consisting of computing, storage, and communication devices. These aggregates have a pre-defined application capacity that simplifies the resource management model, and allows applications to be dynamically re-provisioned as their needs grow. Nodes are added to an already-provisioned application when its capacity is exhausted, or reassigned when they are no longer needed. Alternatively, the applications can be re-deployed to larger or smaller environments as needed.
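
The pull model described by these components can be sketched in a few lines of Python. This is a minimal illustration only, not the OpenSolaris DSC implementation: the class names, fields, and the single-pass reconcile method are all hypothetical, and a real node controller would poll continuously and handle failures.

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    """Hypothetical registry: desired service state plus reported node state."""
    desired: dict = field(default_factory=dict)    # service name -> payload name
    reported: dict = field(default_factory=dict)   # (node, service) -> state


@dataclass
class Repository:
    """Passive payload store; it only answers fetch requests."""
    payloads: dict = field(default_factory=dict)   # payload name -> payload bytes

    def fetch(self, name):
        return self.payloads[name]


@dataclass
class NodeController:
    node: str
    capacity: int                                  # workloads this node can host
    running: dict = field(default_factory=dict)    # service name -> payload bytes

    def reconcile(self, registry, repository):
        """One polling pass: pull what the registry asks for, then report back."""
        for service, payload_name in registry.desired.items():
            if service in self.running:
                continue                           # already hosted on this node
            if len(self.running) >= self.capacity:
                break                              # node is full, so it stops offering
            # The node itself does the download and install work.
            self.running[service] = repository.fetch(payload_name)
            registry.reported[(self.node, service)] = "running"
        return sorted(self.running)


registry = Registry(desired={"web": "web-payload", "cache": "cache-payload"})
repository = Repository(payloads={"web-payload": b"...", "cache-payload": b"..."})
controller = NodeController(node="pod-1", capacity=1)
hosted = controller.reconcile(registry, repository)
```

Because each node runs its own reconcile pass, adding nodes adds provisioning capacity. The registry and repository only answer queries and never push work to the nodes.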

Network optimization through distributing networking logic to the PODs

To achieve the flexibility needed for a dynamic, scalable, load-balanced environment, it is necessary to abstract the destination address by using indirection and resource pooling. Load-balanced networks are normally implemented with a central load balancer that redirects traffic to its destination so that requestors only need to know about the load balancer. This approach results in sub-optimal routing, since the data is routed via the load balancer even if there is a better route to the destination. In addition, if the virtualized services are deployed on a specific hardware platform, that platform must be replicated to scale.

In contrast, with OpenSolaris DSC based load-balancing, when a client connects to a virtual service through a virtual address, the load-balancer re-directs the request to a server from a resource-pool available for the requested virtual service. Scaling is implemented by adding resources to the pool. The node controller queries the registry for the information it needs. This approach enables the automatic update of the configuration of the resource pool for the load balancer.
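
As a rough illustration of a resource pool behind a virtual address, the following Python sketch keeps the pool in sync with registry data and picks servers in turn. The class, its method names, the addresses, and the round-robin policy are assumptions made for the example. The text above does not say which balancing policy OpenSolaris DSC uses.

```python
import itertools


class VirtualService:
    """A virtual address backed by a resource pool of real servers."""

    def __init__(self, virtual_address):
        self.virtual_address = virtual_address
        self.pool = []
        self._counter = itertools.count()

    def sync_pool(self, registry_entries):
        """Replace the pool with the servers the registry lists as running."""
        self.pool = sorted(
            host for host, state in registry_entries.items() if state == "running"
        )

    def route(self):
        """Pick the next server in turn; clients only ever see the virtual address."""
        if not self.pool:
            raise RuntimeError("no servers available for " + self.virtual_address)
        return self.pool[next(self._counter) % len(self.pool)]


service = VirtualService("10.0.0.100")
service.sync_pool({"web-1": "running", "web-2": "running", "web-3": "installing"})
first_three = [service.route() for _ in range(3)]

# Scaling up is just a registry change followed by another sync.
service.sync_pool({"web-1": "running", "web-2": "running", "web-3": "running"})
pool_after_scaling = list(service.pool)
```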

In the VeriScale Architecture, load balancing can be further optimized if the networking logic is implemented locally in the service’s POD and treated as part of the application. The load-balancing networking logic in the POD scales automatically as payloads are added to scale the services in the POD. The networking logic is embedded in the payload for the applications, which enables the communication to the next destination. (Example: A web server communicates with a locally installed load-balancer, on the same server, which re-directs the traffic to a data store outside the server hardware.) This will provide elastic scale for the networking logic which follows the scaling of the applications. Additional optimization can be achieved if response times are measured, making it possible to optimize the network routing on the fly and determine the optimal path through the network to the target dynamically.

The use of software components that are co-located with applications (e.g., the distribution of the registry and its logic to the POD) helps reduce the network traffic and, as a result, reduces the number of physical devices on the network. At the same time, latency can be improved as there are fewer devices, fewer communication hops, and optimized communication paths.

Service delivery network (SDN) architecture and VeriScale automation

Sun’s service delivery network architecture is a domain-specific, modular, and flexible logical model and language for providing a service-oriented view of the network architecture. The SDN architecture provides architectural guidelines, while the actual architecture implemented is designed based upon specific requirements.

The SDN architecture consists of different service domains. Each service domain is a grouping of similar services, called service instances, that include the specific attributes needed for a service to be reachable over the network. For example: a Web server is a service instance in the service domain called Web serving. The Web-serving service domain provides services as a single entity, and is comprised of a subnet in a virtual-LAN that includes multiple Web servers grouped for the purpose of load-balancing.

Each service domain is addressable by clients that connect to the service instance and not to a specific server. The service domains are distinguished by their different characteristics: protocols, ports, health monitoring implementation, and security requirements. A separate management network is included in every instance of an SDN architecture. The management network enforces the security requirements of the service domains. A management domain can manage many service domains, while a service domain has only one management domain. At the same time, the modular nature of the SDN architecture enables the addition of security modules anywhere in the architecture.

In the VeriScale architecture, the SDN provides the glue between the applications and network, and is essential to achieving network automation. In this context, the VeriScale architecture defines service domains that are grouped into service modules, while a service module is a collection of service domains that have a specific purpose — normally an application composed of a collection of software components. These service modules are designed to be easily replicated and distributed, allowing the application to scale on demand.

Two practical illustrations

The advantages of the VeriScale architecture can be illustrated by describing Web server deployment and load-balancing in a traditional versus a VeriScale datacenter.

Deploying Web servers

In the traditional datacenter, when a Web server is deployed, a provisioning server connects to the target node, uploads the Web server application, performs the installation, and uploads content. Manual intervention or the use of difficult-to-maintain semi-automatic scripts is commonly needed to configure the network appropriately for the added service. Clearly, if there is a sudden need for the allocation of several dozen, or for that matter, several thousand Web servers, the dependence of a traditional datacenter on centralized resources means that it would struggle to service such a request.

In the VeriScale architecture, the act of allocating the POD to the Web serving service domain causes the POD to pull the appropriate Web server payload from the repository. For this purpose, the POD uses its own local registry, which is pre-configured and automatically maintained, to determine all the network information it needs. In this way, the POD can provision itself without the use of centralized, potentially scarce resources and without manual intervention. Assuming the PODs are available and activated, servicing the request for thousands of new Web servers would be as simple as servicing the request for a single Web server.

Load balancing

In the traditional datacenter, dedicated load-balancing devices distribute requests from a client to a server in a pool of resources. The load balancers publish a virtual service with one unique, virtual IP address and all requests are directed to it, while each server in the resource pool handles the requests forwarded to it by the load balancers. Unfortunately, the stand-alone load-balancing and topology control functionality suffers from the bottleneck created by the load balancers, limiting the capacity of the network services and requiring the periodic addition of dedicated load balancers. In the VeriScale architecture, the load-balancing functionality is implemented in each POD and may be distributed onto every server. As a result, the topology control and load distribution are decentralized, do not require dedicated devices, and their capacity does not need to be managed.

Conclusion

From the primary requirement of the VeriScale architecture — scalability — follows its primary characteristic — POD self sufficiency. Put another way, each service is encapsulated within a payload with the full range of capabilities required. Once delivered to any suitably-capable POD, these payloads — applications, application platforms, or entire virtual machines (VMs) — can configure themselves and provide the useful function they were designed to provide with no support from central resources. These capabilities help enable the creation of elastic service domains that can rapidly scale up or down on demand, limited only by the availability of hardware resources.

By Marshall Choy, Systems BU

This week we announce the next evolution of Sun's Open Network Systems solution architecture. Get ready, it's a long one: "Sun Open Network Systems Enterprise 2.0 Solution for Oracle". When we first developed the architecture for this solution, one of the customer problems we were trying to address was ease and speed of deployment, as well as ease and speed of adding incremental capacity. In other words, simplifying things for our customers. We integrated the elements of compute, network, and storage into a scalable and flexible solution using Sun Blade 6000 computational building blocks, the integrated SB6000 Virtualized Multi-Fabric 10 GbE Network Express Module, and the Sun Storage 7410 for bulk network-accessible storage. In doing so, we took a selection of unmodified industry-standard building blocks and made sure the customer wouldn't run into any issues. We also validated the compatibility of these HW building blocks with the Solaris OS, Oracle DB, Oracle middleware, and the open-source-based GlassFish Webspace server.

Building and deploying the solution PoC, including HW provisioning, installation, and configuration, and SW installation and configuration, took approximately 256 man hours, using 4 staff with different skill sets, including solutions engineers, a performance engineer, and a SW engineer. Of those 256 man hours, 192 were spent troubleshooting and resolving issues, working out the overall pairing of different elements, and vetting different architectural design points. Hence, up to 75% of this time is time that a customer will not have to expend deploying this architecture. Additionally, customers will require fewer staff to deploy it, as much of the architectural and engineering expertise will be unnecessary.

In the coming weeks, look for a Sun BluePrints paper that'll talk about much more including the performance of this solution as tested in our labs.

by Dean Halbeisen

The need to integrate Windows platforms and UNIX platforms is definitely not a new challenge for IT professionals. More and more, Windows and UNIX platforms are expected to share data in the same files on centralized storage systems. There are many solutions available to make this possible, from simple volume-level solutions to complex single sign-on solutions. All of these solutions aim to address the fact that Windows and UNIX platforms use different security structures for their user and group authentication to control access to files and directories. With OpenSolaris we (Sun) took a very intuitive approach to address this challenge by building the CIFS stack directly into the OpenSolaris kernel. With the CIFS stack built into the kernel, the new features of NFSv4, the new features of ZFS, and many other OS enhancements, the door was wide open to deliver a seamless, ubiquitous, cross-protocol file sharing system (see http://blogs.sun.com/amw/entry/cifs_in_solaris).

NetApp, for example, controls the authentication at a volume level. This means that each volume in the storage system has to be configured for an authentication mode: UNIX, NTFS, or mixed. The UNIX and NTFS modes only permit clients to use user or group credentials specific to that authentication mode. For example, if a Windows host attempts to access a file on a volume configured for UNIX authentication, the storage system will map the Windows credentials to the UNIX credential structure if a matching user or group credential exists in the UNIX LDAP or NIS. The mapping of credentials works in the same fashion for UNIX clients accessing volumes configured for NTFS authentication. When a volume is configured for mixed authentication mode, the volume can use both NTFS and UNIX credentials on files and directories, but each file or directory can only use one authentication mode at a time. When you use mixed authentication mode on a volume, you have to maintain documentation as to which files and directories use each authentication mode, because if the mode gets changed, clients may lose access to the files or directories.

Solutions that use volume or share level authentication configurations are cumbersome to configure and maintain.  Even when configuration processes can be scripted you would likely have to maintain some sort of documentation to keep track of how each volume or share is configured and maintain special instructions on how to maintain the configuration going forward.  

Many single sign-on solutions convert UNIX authentication into Windows Active Directory authentication by installing host agents on the UNIX platforms that map the UNIX authentication structures into Windows Active Directory authentication structures. In single sign-on solutions the storage systems do not use any form of local authentication mapping to control access to files and directories, because each client performs its own mapping through host agents. Some single sign-on solutions centralize the credential mapping by requiring a centralized proprietary name information server that performs the credential mapping for each host agent on the UNIX clients, instead of each host agent querying the UNIX and Windows directory servers independently.

Single sign-on solutions are very complex, tough to troubleshoot, and costly to maintain. Interoperability is probably the biggest challenge for single sign-on solutions. You have to make sure that every application, every server, every operating system, and every storage device that will use the configuration is compatible with the single sign-on software.

The Identity Mapping feature of the Sun Storage 7000 addresses the challenges of Windows and UNIX file sharing unlike any other solution available. The Identity Mapping service is configured for the entire appliance from a single point in the BUI, or from a single point with the CLI. The appliance's underlying filesystem, ZFS, does not place any restrictions on how authentication structures from Windows or UNIX platforms are used, and both can be used seamlessly and simultaneously in any share in the storage system. The Identity Mapping feature stores the user and group mapping in a database on the appliance and only has to be configured one time for each authentication policy or authentication rule. You can configure the Identity Mapping service to use directory-based mapping, user-based mapping, and ephemeral mapping, all simultaneously or independently.
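
To make the three mapping modes concrete, here is a minimal Python sketch of how a lookup might fall through from one mode to the next. It is not the appliance's implementation: the class, the lookup order (directory first, then user rules, then ephemeral), the account names, and the starting value for temporary IDs are all assumptions chosen for illustration.

```python
class IdentityMapper:
    """Resolve a Windows account name to a UNIX UID (illustrative only)."""

    def __init__(self, directory_map, user_rules, unix_uids, ephemeral_start):
        self.directory_map = directory_map    # Windows name -> UNIX name, from a directory
        self.user_rules = user_rules          # Windows name -> UNIX name, admin-defined
        self.unix_uids = unix_uids            # UNIX name -> UID
        self.next_ephemeral = ephemeral_start
        self.ephemeral = {}                   # Windows name -> temporary UID

    def resolve(self, windows_name):
        """Return (mapping source, UID), trying persistent mappings first."""
        for source, table in (("directory", self.directory_map),
                              ("rule", self.user_rules)):
            unix_name = table.get(windows_name)
            if unix_name in self.unix_uids:
                return source, self.unix_uids[unix_name]
        # No persistent mapping exists: hand out a temporary UID and remember it.
        if windows_name not in self.ephemeral:
            self.ephemeral[windows_name] = self.next_ephemeral
            self.next_ephemeral += 1
        return "ephemeral", self.ephemeral[windows_name]


mapper = IdentityMapper(
    directory_map={"CORP\\alice": "alice"},
    user_rules={"CORP\\bob": "rbob"},
    unix_uids={"alice": 1001, "rbob": 1002},
    ephemeral_start=2**31,                    # hypothetical range for temporary IDs
)
alice = mapper.resolve("CORP\\alice")
bob = mapper.resolve("CORP\\bob")
carol = mapper.resolve("CORP\\carol")
```

The point of the sketch is that all three modes can coexist behind one lookup, so no per-share or per-volume mode has to be chosen.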

This solution is likely the highest performing UNIX and Windows file sharing solution available because it has the lowest overhead at the volume/share level and does not require external software. Interoperability of the Identity Mapping service is a breeze, as it communicates with Windows Active Directory, UNIX NIS, and UNIX LDAP directly without requiring host agents or proprietary name information servers. It is cost effective, as this feature is included in the initial purchase cost of the appliance and future feature enhancements are included in the freely available appliance software upgrades. The solution also provides observability like no other solution available. The DTrace Analytics feature of the Sun Storage 7000 Series enables you to see inside your workload and break it down by protocol, share, file, client, latency, and transfer size in live or post-processed graphs.


By Ron Graham, Systems Technical Marketing


This blog is about how to build a medium to large virtualization system with all the great new technology that we have today. The discussion points will take you from why virtualization, to defining some of the components that I would use, and then putting them all together.

Many years ago when I was working with customers we used to talk about virtualization technology, and most customers were just kicking the tires of this technology. Today, customers are implementing virtualization technology in their data centers with great success. Server virtualization projects are everywhere and it's hard to talk to customers without talking about virtualization. The goal of virtualization is to reduce the amount of hardware and increase the availability of the hardware. Hopefully you kind of get that green feeling when you talk about virtualization and consolidation.

Most IT shops have calculated that the average CPU utilization for servers in their data centers is less than 10 percent. This is not desirable for any business. Think about owning a rental car business and only renting the cars 10% of the time. You would stand to make a lot more money if you rented the cars out 70% of the time. On the other hand, if you had a tool that would save you money, you would use that tool a lot more than 10% of the time. Also, by increasing the utilization of systems through consolidation, you would be saving money on power and cooling. Because you have fewer machines, your maintenance cost would decrease. Hopefully you get the idea.
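
The 10% versus 70% comparison can be turned into a quick back-of-the-envelope calculation. The Python sketch below assumes that all servers have equal capacity and that workloads add up linearly when consolidated. That is a simplification, and the fleet size of 100 is made up for the example.

```python
import math


def servers_after_consolidation(servers, current_util_pct, target_util_pct):
    """Servers needed to carry the same total work at a higher utilization.

    Assumes identical servers and workloads that combine linearly.
    """
    total_work = servers * current_util_pct          # in "server-percent" units
    return math.ceil(total_work / target_util_pct)


# A hypothetical fleet of 100 servers averaging 10% CPU utilization,
# consolidated onto hosts that run at 70%.
needed = servers_after_consolidation(100, 10, 70)
saved = 100 - needed
```

Under these assumptions, roughly six out of every seven machines could be retired, which is where the power, cooling, and maintenance savings come from.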

Sun has been a leader in virtualization for over 20 years now in one form or another. I say this, because there are a lot of companies that can make two cpu socket virtualization systems. But, find a company that will invest into scaling up to eight sockets for x86 systems, and you have found a company that has the vision to solve difficult problems. This is where I see the true value of Sun and a differentiator. Not only do we design and build smaller two socket systems, but we also have great four and eight socket systems.

Intel has just announced the Nehalem CPU micro-architecture and Sun has designed some very elegant virtualization solutions around this technology. The solution that I will cover is virtualization with a new network platform that Sun has developed on the Sun Blade 6000 chassis. Sun has designed a couple of chips (ASICs) into a Network Express Module (NEM) that provides up to 10 Gb Ethernet to the server modules or blades. This is an inexpensive way to reduce the number of cables by a factor of 10 when compared to 1 Gb Ethernet.

Sun has also come out with a storage technology called the Unified Storage system. This system has some great features like easy-to-use DTrace analytics. This feature allows users to drill down into their storage system and graphically figure out where problem areas are. Also, it is easy to set up and use. I set up an NFS server with the 7410, and it literally took me a few minutes to set up my storage pool and configure it. After entering credentials and setup information in my virtualization management software, I was able to see and use the new storage pool. Truly remarkable!

What a great time to implement virtualization. With the Nehalem architecture we can put up to twice as many virtual machines on a server as on older two-socket architectures (based on internal testing). This means running fewer servers and saving money. With the new Sun Virtualized NEM, we can save on setup and cable aggregation, and there is zero network administration. Storage is a key to virtualization, and with our Unified Storage system we can easily set up, debug, and manage our storage infrastructure.

So this is how I see this particular solution playing out. Our Sun Blade 6000 chassis can hold 10 server modules. The new server module from Sun based on Intel Nehalem technology is the Sun Blade X6270 server module. Two CPU sockets with up to 144 GB of RAM and 270 Gbps of IO. Sweet. Memory is key in virtualization technologies; most of my customers are more memory bound than CPU bound. Using DDR3 memory with this new architecture allows virtualization engines to really perform on this platform.

Two Sun Blade 6000 Virtualized Multi-Fabric 10GbE NEMs will provide most of the IO throughput and redundancy. This way, each blade has two 10GbE ports and two 1GbE ports. For best practice, I recommend the following network connections:

1Gb    - Management
1Gb    - Backup and Recovery
1Gb    - VMotion
4Gb    - Data
2Gb    - Storage

The above is for each server, and depending on how much data IO you're planning on driving, your throughput could vary. But let's use this as a starting point. Best practices also state that some of the above networks should be dedicated. With the Sun Blade 6000 we always have the option of putting in two industry-standard express modules and configuring each server any way we want. For example, we can put in combo cards that have Fibre Channel and networking, more 10GbE, quad 1GbE, and so on. We have a lot of choices. It all depends on how much money you want to spend and what your current environment consists of.
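
As a sanity check on the per-server plan above, this short Python sketch adds up the recommended bandwidth and compares it with the ports on a blade with two NEMs (two 10GbE and two 1GbE ports, as stated earlier). The failure check is an illustrative assumption: it simply asks whether the remaining ports could carry the planned load if the largest port were lost.

```python
# Recommended per-server allocation from the list above, in Gb/s.
plan_gb = {
    "Management": 1,
    "Backup and Recovery": 1,
    "VMotion": 1,
    "Data": 4,
    "Storage": 2,
}

# Ports available to each blade with two Virtualized NEMs installed, in Gb/s.
ports_gb = [10, 10, 1, 1]

total_demand = sum(plan_gb.values())
total_capacity = sum(ports_gb)

# Simplified failure check: lose the single largest port and see what is left.
capacity_after_failure = total_capacity - max(ports_gb)
survives_port_failure = capacity_after_failure >= total_demand
```

This only checks aggregate numbers. It says nothing about which networks should be dedicated to their own ports.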




For argument's sake, let's configure the system with one dual-port 1GbE card for each of the 10 servers. This way, each physical server has two 10GbE and four 1GbE NICs to handle IO. The configuration would have full failover and redundancy on all NIC ports and have enough headroom to take on additional load if one should fail. Data would go on one 10GbE port and storage on the second 10GbE port. The two 10GbE ports could be set up to fail over to each other. The rest of the network can be distributed on the four 1GbE ports, with VMotion using its own dedicated network. VMotion needs all the bandwidth when moving virtual machines from one server to another.

Looking back at this setup, it's not that complicated to set up and manage. From a performance perspective, you should have plenty of bandwidth. I like to look at things to see how balanced the architecture is. Starting from CPUs and memory bandwidth, we have some of the fastest CPUs developed for virtualization, the memory has three channels with an on-board memory controller that can run at 1333 MT/s, and the Sun Blade X6270 has bigger, faster, and more pipes to move memory, which is key to virtualization performance. With PCI Express generation 2, our IO becomes twice as fast as generation 1. Then, with two 10GbE ports and four 1GbE ports, there is lots of IO bandwidth to communicate with the rest of the world. Like I said, a balanced architecture.

This blog is only intended to start with the hardware piece; I will blog later about installing virtualization software. It will be interesting to compare VMware and Hyper-V in terms of management and performance on this platform.

By Roger Bitar, Systems Technical Marketing


Introduction


As a follow up to the previous Sysbench benchmark that ran on Solaris UFS, we re-ran the benchmark on Red Hat Enterprise Linux release 5.2 using ext3 filesystem on the same setup and configuration. According to Allan Packer's recommendation for tuning MySQL on Linux, we added the following parameter to the initial MySQL configuration file, my.cnf:


innodb_flush_method = O_DIRECT

This parameter will cause MySQL to bypass the filesystem cache, and avoid double buffering. This is similar to the forcedirectio option with Solaris UFS.


We also used the noop scheduler as it provided the best results:


# echo noop > /sys/block/sde/queue/scheduler
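To confirm that the change took effect, the same sysfs file can be read back: the kernel lists the available schedulers and marks the active one with square brackets. The following is a minimal Python sketch of parsing that listing. The helper names are hypothetical, the device name sde is simply the one used above, and the sample string is an illustration rather than output captured from the test system.

```python
from pathlib import Path


def active_scheduler(contents: str) -> str:
    """Return the scheduler marked with [brackets] in a sysfs scheduler listing."""
    for token in contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active scheduler marked in: {contents!r}")


def read_active_scheduler(device: str = "sde") -> str:
    """Read /sys/block/<device>/queue/scheduler (Linux only) and return the active scheduler."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return active_scheduler(path.read_text())


# Illustrative listing only; on a real system call read_active_scheduler("sde").
sample = "[noop] anticipatory deadline cfq"
print(active_scheduler(sample))
```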

Results on RHEL 5.2


The following TPS (transactions per second) results were obtained for read-only operations:


The following latency results were obtained, smaller is better:


Conclusion



  1. SSDs demonstrated a significant advantage (up to 8x) for this read-only workload in environments where memory was constrained when using smaller innodb_buffer_pool_size. 

  2. SSDs can achieve around 100% of the performance of an almost fully cached DB. This is evident when we used the buffer size of 24GB (or about 90% of the DB). That means that in environments where most I/Os are satisfied from disk, rather than system memory, SSDs should be able to sustain about the same throughput.

  3. Database transaction latency is much better (14x) when using SSDs compared to HDDs.

  4. The best results with this type of workload are obtained on regular disks along with ample main memory. SSDs come a close second, even when main memory is severely constrained. Throughput is significantly worse when regular disks are combined with insufficient buffer memory.

By Roger Bitar, Systems Technical Marketing

Introduction

As a follow-up to the previous Sysbench benchmark that ran on Solaris UFS, we re-ran the benchmark using the ZFS filesystem on the same setup and configuration. Solaris ZFS does not offer the forcedirectio option that UFS has, so we followed the “ZFS Best Practices Guide” recommendations instead. Namely, we limited the size of the Adaptive Replacement Cache (ARC) to 1GB, and we set the ZFS recordsize to 16K to match the InnoDB page size. We used the same MySQL configuration file “my.cnf” that was used in the previous UFS test.
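The post does not show the exact settings, so the following Python sketch only illustrates what the two ZFS tunings described above typically look like: an /etc/system entry for the zfs_arc_max tunable and a zfs set recordsize command. It assumes that "1GB" means 1 GiB, and the dataset name tank/mysql is a hypothetical placeholder.

```python
def arc_max_line(limit_gib: int) -> str:
    """Build an /etc/system entry that caps the ZFS ARC at limit_gib GiB."""
    limit_bytes = limit_gib * 1024 ** 3
    return f"set zfs:zfs_arc_max = {limit_bytes:#x}"


def recordsize_command(dataset: str, page_size_kib: int = 16) -> str:
    """Build the command that sets the ZFS recordsize to match the InnoDB page size."""
    return f"zfs set recordsize={page_size_kib}K {dataset}"


print(arc_max_line(1))                   # entry to place in /etc/system
print(recordsize_command("tank/mysql"))  # "tank/mysql" is a hypothetical dataset name
```

A changed recordsize applies only to files written after the change, so it would normally be set before the database files are loaded.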

Results on Solaris ZFS

The following TPS (transactions per second) results were obtained for read-only operations:

The following latency results were obtained, smaller is better:


Conclusion



  1. We obtained better results with UFS because of the forcedirectio option used when mounting the filesystem. ZFS does not have this option; instead, we limited the ARC to 1GB in the /etc/system file.

  2. Again, SSDs demonstrated a significant advantage (up to 7x) for this read-only workload in environments where memory was constrained when using smaller innodb_buffer_pool_size.

  3. Database transaction latency is much better (30x) when using SSDs compared to HDDs.

  4. The best results with this type of workload are obtained on regular disks along with ample main memory. SSDs come a close second, even when main memory is severely constrained. Throughput is significantly worse when regular disks are combined with insufficient buffer memory. 


By Yan Fisher, Systems Technical Marketing

Web2.0 data centers are typically filled with racks of x64 servers running a single app per box: an architectural convenience, but a decision that leads to inefficiencies of utilization, power, and space. Sun has been working on this challenge with a unique and effective approach for several years: chip multithreading, or CMT. Since introducing our UltraSPARC CMT-based systems over three years ago, we have consistently demonstrated that architectures with multiple cores, each supporting multiple threads at the hardware level, can introduce significant efficiencies for some application environments. Web 2.0 turns out to be one such example.

Last month our Performance and Applications Engineering (PAE) team at Sun compared a single Sun SPARC Enterprise T5120 server (UltraSPARC T2 processors @ 1.4 GHz) against eight lightly loaded (about 40% CPU utilization) Sun Fire V20z systems (AMD Opteron Model 248 processors @ 2.2 GHz). With both configurations driving 2,400 users, they presented several interesting points of comparison:


  • System density: 1 rack unit versus 8 rack units

  • Chip density: 1 chip versus 16

  • Core density: 8 versus 16

  • Thread density: 64 versus 16

In other words, the Sun SPARC Enterprise T5120 server occupies 1/8th the space and has half the number of CPU cores, but 4 times the number of threads. Measured CPU utilization on the CMT server was about 65% in this case.
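The ratios in that summary follow directly from the density figures listed above. As a quick check, here is a small Python sketch that recomputes them; the dictionary layout and function name are just illustrative choices, and the numbers are the ones quoted in the post.

```python
# Figures quoted in the post: (one T5120, eight V20z systems combined)
density = {
    "rack units": (1, 8),
    "chips": (1, 16),
    "cores": (8, 16),
    "threads": (64, 16),
}


def ratio(t5120: int, v20z_total: int) -> float:
    """How many times the V20z total the single T5120 has."""
    return t5120 / v20z_total


for name, (t5120, v20z_total) in density.items():
    print(f"{name}: T5120 has {ratio(t5120, v20z_total):g}x the V20z total")
```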

Testing was done using the Olio web2.0 benchmark. Specifics include:


  • 64 GB memory

  • Operating System: Solaris 10 5/08

  • Coolstack 1.3.1 software: PHP, MySQL, Apache, Memcached, Tomcat

  • Faban benchmark driver v0.9

  • Web2.0 benchmark kit - 082108

The results were pretty dramatic: the same user load was reached with one-eighth the rack footprint. Furthermore, the Sun SPARC Enterprise T5120 used 0.20 watts per user, versus 1.163 watts per user for the Sun Fire V20z, or less than 18% of the power! In fact, further testing showed that the Sun SPARC Enterprise T5120 server could achieve 3,200 users with about 95% CPU utilization. This represents a remarkable 10 to 1 consolidation. We have seen similarly dramatic results with our UltraSPARC CMT-based systems running other applications, both elsewhere at Sun and at customer sites. For example, you might want to take a look at the blueprint Tuning Symantec Brightmail AntiSpam on UltraSPARC T1 and T2 Processor-Powered Servers. CMT seems to be ideally suited to many Web2.0-like activities, a perfect match for consolidation.
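One way to arrive at the power and consolidation figures is sketched below in Python. The power ratio uses the two per-user numbers quoted above. The consolidation estimate is an assumption about how the 10 to 1 figure was derived: it supposes each V20z keeps serving its lightly loaded share of the 2,400 users (300 users per server) and asks how many such servers the T5120's 3,200 users would replace.

```python
T5120_WATTS_PER_USER = 0.20
V20Z_WATTS_PER_USER = 1.163

# Fraction of the V20z power needed per user on the T5120.
power_fraction = T5120_WATTS_PER_USER / V20Z_WATTS_PER_USER

# Assumption: each of the eight V20z systems carries an equal share of 2,400 users.
users_per_v20z = 2400 / 8

# Number of lightly loaded V20z systems whose users fit on one T5120 at 3,200 users.
consolidation = 3200 / users_per_v20z

print(f"T5120 uses {power_fraction:.1%} of the V20z power per user")
print(f"One T5120 replaces about {consolidation:.1f} lightly loaded V20z systems")
```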

Nick Kloski, Systems Technical Marketing



Today, Sun is launching an initiative centered around building awareness of MySQL on Sun systems.  Part of that announcement is a project called, in various forms, “the Web2Kit”, also known as “Olio.”  When you peruse Sun's website, you will see reference to Sun Web2Kit, while if you go to Apache's site, you will see a similar kit named “Olio.”  Before getting into the nitty gritty details about how to set up Olio, I wanted to make clear how those two terms differ.

To start off, allow me to define what Olio is.  The Performance and Applications Engineering team in Sun has developed a framework that allows you to load a complete Web2.0 architecture onto one or more systems and lets you (I am mostly talking to developers here) play around with various web technologies to see how the other side works.  Here are the components in Olio proper:



  • webserver (Apache, Sun WebServer, or any other server you desire to use)

  • MySQL as the backend database

  • Glassfish / Mongrel as the Application server

  • Front-end Applications written in Java, PHP, and Ruby on Rails


In a fully running Olio setup, you basically have a set of applications that mimic the common functionality of a social media platform (presentation layer for a new user logging in, account creation, calendar viewing, addition of a calendar entry, etc.) through three different mediums (PHP, Java, RoR).  Since each of the interfaces does the same thing, you can compare how, for example, the Java interface differs from the Ruby interface at every level: from how the application interacts with Apache, Sun's webserver, or lighttpd, all the way to how the backend database (which is MySQL or any DB you desire to use, as long as you build the table spaces appropriately) reacts to incoming user requests (even in a replicated environment, which Olio is set up to create!).  So, all in all, Olio is a testbed for you to play around with commonly used languages, interacting with a back-end database.

I have described Olio in a few different ways, but it is important to understand that Olio is of interest to developers.  Think about Olio this way:  Olio is a test platform that allows you to compare different application programming languages and tinker with various buttons, within the framework of an open-source and community accepted framework.

You might be thinking that I am describing something of a benchmark, and you would not be far off.  Using Olio on one set of machines will give you a number of how many transactions per second your hardware can accommodate.  Running the same exact software setup on another set of servers will give you another set of numbers.  Poof!  You have a hardware benchmark!  Alternatively, if you desire to make a software benchmark, then out of the box, you will be able to play around with Ruby and Java (and PHP) application schemas and see what types of application software technologies might better suit your needs.

The Web2Kit and Olio – How Sun will help the community

Now that I have given a brief overview of Olio, how does the Web2Kit come into play?  On Sun's website (http://www.sun.com/web2kit) you will not only see Olio, but soon  the various Sun-optimized components in the “Web Tools” stack.  These will be pre-compiled, open source binaries for PHP, MySQL and a whole host of similar technologies to help you install not only Olio, but any common web related frameworks you desire.

The engineers in the Performance and Applications Engineering group will participate in the community to help Olio grow, and answer questions about deployment scenarios and other technical questions.  Use Olio to learn how PHP works, compared to your already existing knowledge of Java, for example.  Olio will help you see how the same application is programmed in other languages.  As Olio matures, Sun will release other components into the stack to help you see how to intelligently use and integrate new pieces into your own deployments (for example, a memcached piece is being worked on for later inclusion).  Some day, if Sun decides that we have the ultimate MySQL Proxy that will revolutionize the way MySQL works, then we will showcase that technology through Olio (NOTE I said “if”, don't believe that this will happen soon, if at all, I was just picking something fanciful to make a point :)

My next blog posting will detail the initial steps on how you set up the Olio base components, and how you would go about setting up test runs and finding out how to gather the resulting data.  For more status on the Apache incubation process (within which Olio is currently incubating), refer to: http://incubator.apache.org/olio/


By Roger Bitar, Systems Technical Marketing


Introduction


Sun will soon introduce Solid State Drives (SSDs) to its lineup of systems. SSDs are bound to change the dynamics of the IO subsystem. A traditional 15K rpm disk can do around 150 random IOs a second. A single SSD, however, should be able to do up to 30,000 random reads per second while consuming a maximum of 3W, as rated by the manufacturer. In addition, SSDs provide faster access to data (on the order of microseconds), while traditional hard disk drives (HDDs) have access times thousands of times slower (on the order of milliseconds).


Workload


We chose to test MySQL, the open source database (DB), using a simple MySQL benchmark called Sysbench. We populated the Sysbench table with 114 million rows (around 27GB in size), which fit on one SSD. We executed read-only queries while varying the buffer size. We mounted the file system in DIRECTIO mode to disable file system caching. We performed the tests with regular HDDs and repeated them with SSDs.

We were interested in measuring the performance while varying the size of the cache available to the MySQL DB. The following innodb_buffer_pool_size values were used: 8GB, 16GB, and 24GB.
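To see how constrained each configuration is, it helps to express the buffer pool sizes as a share of the roughly 27GB data set. The Python sketch below does that arithmetic. It is an approximation that ignores InnoDB's own memory overhead, and the helper name is hypothetical.

```python
DB_SIZE_GB = 27  # approximate table size quoted in the post


def cached_fraction(buffer_pool_gb: float, db_size_gb: float = DB_SIZE_GB) -> float:
    """Upper bound on the share of the DB that can sit in the buffer pool."""
    return min(buffer_pool_gb / db_size_gb, 1.0)


for pool_gb in (8, 16, 24):
    share = cached_fraction(pool_gb)
    print(f"{pool_gb} GB buffer pool: at most about {share:.0%} of the DB in memory")
```

With the 24GB setting, close to 90% of the table can be cached, which is why that run approximates an almost fully cached database.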


Hardware


For the MySQL DB server, we used a SunFire X4150 system populated with two quad-core Intel X5355 Xeon processors, running at 2.66GHz. The system also was populated with 32GB of RAM and 4 disk drives, one of which was a 30GB SSD.

The Sysbench benchmark ran on a SunFire X4440 system equipped with four quad-core AMD 8356 Opteron processors and 16GB of RAM.


Software


We used OpenSolaris 2008.05 OS, MySQL 5.1.28 (64-bit release), and Sysbench 0.4.8.


Tuning


We followed the guidelines posted in Neelakanth Nadgir's blog. The following parameters were used in the MySQL configuration file my.cnf:




sort_buffer_size = 32768

table_open_cache = 2048

innodb_buffer_pool_size = 8192M

innodb_additional_mem_pool_size = 20M

innodb_log_file_size = 400M 

innodb_flush_log_at_trx_commit = 1

innodb_thread_concurrency = 0

innodb_log_buffer_size = 64M
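The listing shows the 8GB case (8192M); the 16GB and 24GB runs would differ only in innodb_buffer_pool_size. As a convenience, here is a small Python sketch that converts MySQL-style size values to bytes and generates the setting for each run. The helper names are hypothetical; the K, M, and G suffixes are treated as powers of 1024, as MySQL does.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def size_to_bytes(value: str) -> int:
    """Convert a my.cnf size such as '8192M' or '32768' to a byte count."""
    value = value.strip().upper()
    if value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


def pool_setting(gib: int) -> str:
    """Build the innodb_buffer_pool_size line for a pool of gib GiB."""
    return f"innodb_buffer_pool_size = {gib * 1024}M"


for gib in (8, 16, 24):
    print(pool_setting(gib))
```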



Results on Solaris UFS


The following transactions per second (TPS) results were obtained for read only operations when mounting the file system with forcedirectio option:



 The following latency results were obtained, smaller is better: 




Conclusion



  1. SSDs demonstrated a significant advantage (up to 7.25x) for this read-only workload in environments where memory was constrained when using smaller innodb_buffer_pool_size.

  2. SSDs can achieve around 95% of the performance of an almost fully cached DB. This is evident when we used the buffer size of 24GB (or about 90% of the DB). That means that in environments where most I/Os are satisfied from disk, rather than system memory, SSDs should be able to sustain about the same throughput.

  3. Database transaction latency is much better (65x) when using SSDs compared to HDDs.

  4. The best results with this type of workload are obtained on regular disks along with ample main memory. SSDs come a close second, even when main memory is severely constrained. Throughput is significantly worse when regular disks are combined with insufficient buffer memory.

Jacques Bessoudo, Systems Technical Marketing

There has been much confusion over what power calculators are and I have run into differing opinions about what they should be or should do. Some expect power calculators to be datacenter planning tools, others expect them to be a guideline of how much power a system will consume.

Both are valid expectations, but when using a specific tool, no matter which vendor, it is of the utmost importance to understand what it provides: this should be specified by the vendor that publishes the calculator in order to disclose the intention of the calculator.


What They Are

Power calculators are tools that provide information about the power consumption of a system. They come in many different forms and each vendor has its own way of providing the data. Until recently, there was no industry standard benchmark for power consumption, so vendors did what they thought was right in order to provide useful information to customers - of course, most vendors didn't agree on what was useful to the customer, so most power calculators available on websites provide data that can't be compared across vendors because they are based on different assumptions and workloads. Below is a summary that indicates where these power calculators can be found and what they use to obtain the data.



Dell
     * Workload: SPECjbb, SPEChpc
     * Where to find it: Online or downloadable calculator
HP blades
     * Workload: Undisclosed - see article
     * Where to find it: Online or downloadable calculator
HP rack
     * Workload: Undisclosed - see article
     * Where to find it: Downloadable calculator
IBM
     * Workload: Prime95
     * Where to find it: Downloadable calculator
Sun
     * Workload: SPECjbb
     * Where to find it: Online calculators




Ideally, the data in the power calculators is obtained from running a workload in a real system and measured at the power inlet - between the wall and the power supplies. It is not modeled or inferred from product specifications, but measured data using instrumentation and real systems.



What They are NOT

Power calculators, unless specifically stated, are not datacenter planning tools. The reason for this is that not all workloads have the same behavior. The power calculator will provide a specific reference, but if the workload is any different, then the power consumption characteristics will change. Even environmental variables like the room's air density, humidity, and temperature will affect the results of these measurements.

System documentation typically includes a datacenter planning guide that specifies what needs to be provided for the correct operation of the system. In most cases, the data included in that document can also be found in the power supply or system label; if there is no datacenter planning guide, this label may come in handy to determine how much power a system is likely to draw as a maximum, since power supplies are designed specifically for the systems they will be energizing.


What They Provide

Systems vendors (Dell, HP, IBM and Sun) provide in their calculators at least two data points - idle power consumption and max power consumption. Max power consumption is specific to the workload run on the platform. Idle is typically the power consumption of a server that is running an OS at the login prompt or that has been logged in with no workload running.

Some vendors provide additional information, like:

    * More than one workload, such as SPECjbb and/or SPEChpc and/or Linpack on specific products
    * BTUs/hr, to help calculate the air conditioning that will be required for that workload (this can be calculated manually by multiplying watts by 3.41: BTUs/hr = 3.41 * watts)
    * Datacenter power requirements, which are typically higher than the 'calculated power consumption'
    * A slider or a % input box that allows the user to 'fine tune' the output of the calculator based on the expected utilization of the server
    * The ability to configure a rack with the hardware and obtain total rack parameters
    * Suggestions on the type of Power Distribution Units that will be necessary for the configurations in the rack
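The BTU conversion mentioned in the list above is easy to script when sizing cooling for several systems. A minimal Python sketch, using the 3.41 factor quoted in the post; the 500 W figure is a made-up example, not a measurement of any system.

```python
BTU_PER_HR_PER_WATT = 3.41  # conversion factor quoted in the post


def watts_to_btu_per_hr(watts: float) -> float:
    """Convert electrical power in watts to heat output in BTUs per hour."""
    return watts * BTU_PER_HR_PER_WATT


example_watts = 500  # hypothetical server power draw
print(f"{example_watts} W is about {watts_to_btu_per_hr(example_watts):.0f} BTU/hr")
```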

Things to be Aware of



Power calculator result comparisons

It is very tempting to take numbers from one vendor's power calculator and compare against another vendor's power calculator, but they don't work that way because systems are not running the same workload in the same environment. The closest you can get to compare data between calculators is to compare idle power and even then, the numbers aren't 100% apples-to-apples, because even the operating system running on these servers might not be the same.


On redundant power supplies

Power supplies in an enterprise class system can be configured in redundant mode for higher availability to prevent the system from shutting down in case of a power supply failure or a power grid failure - if the datacenter is equipped with redundant power grids.

Redundancy is very beneficial for the uptime of the server, but not always ideal for power efficiency, because of the efficiency curves of power supplies. Some power supplies don't reach reasonable power efficiencies until a high load is demanded from them; at Sun, high-efficiency power supplies are used widely across the product line.

Not all power supplies are equal and each has its own load / efficiency curve. Ideally, a power supply will reach a reasonable efficiency, say 80%, under reasonably small loads, say 20-30%. From this point on, as the load increases, efficiency should stay above 80% and ideally reach or surpass 90%.

This is a very important aspect to consider: with redundant supplies, a fully loaded system running all components at maximum will load each power supply to only about 50% of its capacity, because the load is balanced between the two power supplies. Power supplies that are not efficient at small loads will waste a lot more energy when the system is idle, as the load on each of the power supplies might be as low as 15-30% of its total capacity.
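To make the load-sharing point concrete, here is a small Python sketch of the arithmetic. All of the numbers are hypothetical: it assumes two 1000 W supplies sharing the load evenly, a system that draws 1000 W at maximum and 400 W at idle, and illustrative efficiencies of 90% and 80% at those two operating points.

```python
def per_supply_load(system_watts: float, supply_capacity_watts: float, supplies: int = 2) -> float:
    """Fraction of one supply's capacity in use when the load is shared evenly."""
    return system_watts / supplies / supply_capacity_watts


def wall_watts(system_watts: float, efficiency: float) -> float:
    """Power drawn from the wall for a given system load and supply efficiency."""
    return system_watts / efficiency


CAPACITY = 1000.0  # hypothetical rating of each supply, in watts

# (label, system load in watts, assumed supply efficiency at that load)
for label, load, efficiency in (("full load", 1000.0, 0.90), ("idle", 400.0, 0.80)):
    share = per_supply_load(load, CAPACITY)
    drawn = wall_watts(load, efficiency)
    print(f"{label}: each supply at {share:.0%} of capacity, about {drawn:.0f} W at the wall")
```

Under these assumptions the supplies never see more than half their rated load, so their efficiency at the low end of the curve matters more than their peak efficiency.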

Idle systems are inherently the most inefficient systems, because they are wasting energy without producing any work. The higher the utilization of a system, the more work it produces and the better the efficiency in all aspects - electrically efficient by design and [work]/watt whether 'work' is queries per minute or web transactions.


Measuring power on your own

The best way to figure the power consumption of a system is to run it in its actual environment and measure the power it consumes while running the typical application. In order to do this, the internal sensors of the servers can be of great use, since they provide power consumption information with reasonable accuracy. The data from these sensors is usually available from the ILOM remote management interface (depending on the system) under the System Monitoring > Power Management tab.

For rackmount servers, a simple meter like the ones available from Watt's-up can be very useful; their products include simple meters that provide the information on an LCD display, as well as more sophisticated ones that have serial ports for power monitoring, or even a web interface.

Yet another way to do this is to use the Real Time Power Monitoring and Management Service that was recently released for a limited set of products. It requires no hardware and only a small lightweight package to be installed on the servers to be monitored.


James Hsieh, OPL Engineering


In my many years of working with Sun equipment (I started off with an old Motorola-based Sun 2/50 server), each new generation of systems never ceases to impress with how much more functionality we can cram into ever smaller and smaller packages.  It's not just speed and performance (though that certainly is important) – it's things like domaining, fault management, and administration of system resources.

Take for example Sun's landmark E10k server, introduced just over a decade ago in 1997.  It was Sun's first example of a system with the ability to divide system resources up into hardware domains, and use an external service processor to manage resources and provide advanced hardware diagnostics in the event of a hardware failure.  It also took up a fairly hefty footprint in terms of space, power, and cooling (not to mention you could hear the fans for quite some distance).   

Just a decade later, we are now putting many of those same hardware and domaining capabilities into servers that are as small as 6 RU (rack units) in space.  This is the new Sun SPARC Enterprise M4000/M5000/M8000/M9000 server line.  These servers are a great achievement in technology.  It means we can bring these capabilities to people who never were able to have them because of space, size (and of course, cost).

These capabilities, however, have always caused a bit more complexity when it comes to configuration.  Because you can carve up the hardware resources in different ways between different domains, you've needed to follow rules with regards to placement of things like CPUs, memory, and IO.  With the new capabilities in the smaller packages, we now have people who have never been presented with a system with such flexibility – and configuration rules.

A big part of my role at Sun, working for Sun's Systems Group, is understanding what issues are being faced working with the Sun SPARC Enterprise servers.  And I've seen plenty of people struggling with the flexibility in configuration that these servers provide.  So I am working (along with a few friends and the crack Sun BluePrints technical writing staff) to publish a Sun BluePrints article to help take some of the challenges out of the configuration of the Sun SPARC Enterprise M4000/M5000/M8000/M9000 servers.  We're well along in the task, and we hope to have something out for your reading pleasure soon.  Keep checking back with us!

--James

Nick Kloski, Systems Technical Marketing


Last time I talked about toys, and how Sun can—if properly guided—rock the world.  Let's talk about that a bit more, shall we?

I love philanthropy, volunteer for local charities, and would love to work globally on problems that face the world.  When not working on various pressing Sun projects or hanging out in fun locales throughout San Francisco, I really enjoy brainstorming on how technology can help “bridge the digital divide”.

What I like doing is taking 1) an existing technology and 2) shoe-horning that technology into areas not originally intended. I then try to work out all the nitpicky little details and ponder why no one else has thought of trying that combination before.

Let's try it!  Let's take a technology and a concept and see where it goes:


You will notice that the second link goes to an intentionally purposeful description of what Web 3.0 is.  That's the trouble, no one really knows.  In my own mind, future web technologies that extend beyond what there is today will be defined by one main thing:  increasing seamless integration with one's social life.

In this sense, social does not mean “outside of work,” but more in the sense of Sociology, as in the social layer everyone weaves around themselves while they go about their day.  As an oft-quoted ex-CEO of Sun has said (in varying forms over the years): The future of computing is an increasing trend to technology being ubiquitous.  The web technologies of the future will be so integrated into our lives we will not need to think about patching, software installation, compatibility, costly licensing, or anything else than just getting done what we need to get done.

Let's get back to my example.  How can ubiquitous computing help just one person?  Seems easy to start there.  Seems also easy to map out that one person's technology use per day.  I will submit myself as an example, lest I be charged with creating an entirely made-up scenario just to prove my point.  (Which I would never do...)



  • My day starts off with checking my email from my home desktop while I wake up.

  • Since I often work from places “other” than the office, if I want to get out from the house, I go to a local internet cafe, and check my email on my laptop.

  • Then, off to the office for a little while for a meeting.

  • Then, back home for a bit, then over to visit a friend in San Francisco, maybe going out to dinner, maybe a social event, maybe other fun activities, using my cell phone to find local venues, make calls, and keep up with email.



The SunRay World


I mentioned Sun Ray technology before; now, let's bring it home. If I found a Sun Ray at each of the places I visited throughout the day, I would not need to use any sort of specialized device to do any of the activities above, except for making calls and possibly a portable calendar.


The allure of thin computing is the ability to save your session from one device to another across large geographical distances.  Take your smart card out from your home device, plug it in at work and all of your data comes up, ready to go from where you left off, even in the middle of video streaming!

All on clients that in some cases take up only 5 Watts of power.

I would love to envision a world where thin clients exist everywhere we travel, and wonder if people would call that Web 2.0 or Web 3.0?  

I would love to hear from you how, with all issues aside, you would think “thin computing” would change the world if it became pervasive.  Yes we can use it in targeted call center deployments, or in classrooms in a school, but think larger!....How would the world be changed if these devices were everywhere?  

And better yet, how would you start that thin revolution?

-Nick

Kemer Thomson, Sun BluePrints Editor

Observant visitors to the Sun BluePrints site might have noticed that we recently changed the banner from the "Sun BluePrints Wiki" to the "Sun BluePrints Community"!  Just where is the "community," you may ask?  It is all around us, consisting of thousands of readers who regularly visit the site. Now, what we really need is to get more conversations going: this is the essence of "community"! Have you noticed the little message at the bottom of each page on our web site: I'm going to do my Jedi Knight mind-trick right now: I'm waving my hands ever so subtly, urging you, "Sign up! Log in! Tell us what you think!"

A challenge to creating open dialogs these days is that of weeding out those obnoxious opportunists who view the Internet as one big, empty wall for digital graffiti. In 2005 the Los Angeles Times made a bold move with their "wikitorial" and was forced to shut it down in hours, due to the flood of inappropriate material. (http://www.guardian.co.uk/technology/2005/jun/22/media.pressandpublishing) We all learned a lesson from that episode, along with similar experiences that resulted from unfettered (and anonymous) access in those early days of "social networking." There are a couple of ways of addressing this: 1) make it difficult to automate spam generation by forcing the poster to provide some kind of non-automatable response (think "those annoying character strings that take me three times to get right"), or 2) require some form of registration. The latter seems to dominate, and is especially convenient because most browsers will store your account information and relieve you of the tedium of remembering your login.

Indeed, registration can serve as the basis for benefits, mostly based on opportunities for customization. Sun now has a unified registration system, so one has to register only once to take advantage of full access to many resources, such as wikis.sun.com and forums.sun.com, along with the ability to sign up for useful newsletters (don't miss the "My Sun Connection" on the main sun.com page!). Having registered once, when you log in, you will notice a subtle transformation in the Sun BluePrints Community pages, primarily at the bottom of the page: you will see that you can actually edit the labels (which can benefit other visitors) and (this is what I'm really building up to) comment.

Why not take the opportunity to comment, to start a discussion?  What did you like about Dominic Kay's Configuring Sun Storage J4000 Arrays and the ZFS File System in Ten Minutes?  (Web statistics indicated that many found it to be of great interest, one of our "hottest" blueprints in recent months.)  Do you agree with the positions in Sun's Approach To Intelligent Power Monitoring, or did we miss something important?  We would like to see such discussions, and we will update blueprints based on the dialog, making them even better, more valuable.

Larry McIntosh, Systems Technical Marketing


One size doesn't fit all today—at least when it comes to data access within the datacenter.  As much as we would like for this to be true, there are a number of issues that challenge us all regarding this.  Overall, we must really look at many different things, such as scaling requirements, retention of information, keeping data safe, file system availability, and sharing data in heterogeneous environments.

Today's datacenter provides services for data-intensive applications that can run on different machine types, connected through multiple networks.  Using Sun's Magnum Infiniband technology, Constellation System Blades, Sun's StorageTek Offerings—and of course Sun's Sun Fire X4500 Series Storage Servers—we can now provide solutions that service very disparate networks and platforms.  We can also help to focus on the care and feeding of data to assure that data is kept soundly and can represent various instances of the data for recall purposes.  This can be achieved, given a combination of architecture and technology within Sun's file systems, and deployed in conjunction with an appropriate business continuance model.

Various combinations of software and hardware can address different problems, depending upon requirements for performance and availability, not to mention the budgets and goals an organization has regarding their data. The following picture summarizes five file systems from Sun Microsystems that provide a great deal of flexibility in building solutions. (I am aware that a “file system purist” may have difficulty in classifying all of these as “file systems”, but, in the spirit of this discussion, I would like to suggest that “file system” can also refer to the software service that provides access to data!) Let me try to put some perspective on these.


Solaris ZFS
ZFS can be used for excellent local I/O within a Solaris-based platform.  ZFS's architecture can support very large amounts of storage.  It is extremely simple to administer and focuses strongly on data integrity.  Every block is checksummed to prevent silent data corruption.  ZFS's data is self-healing in mirrored configurations.  If one copy is damaged, ZFS detects it and uses another copy to repair it.  ZFS has fast software-based RAID, using a new model called RAID-Z that is similar to RAID-5. Unlike RAID-5, it uses a variable stripe width that eliminates the stripe corruption that can occur due to loss of power between data and parity updates.  The file system also implements dynamic disk scrubbing to enhance reliability by reading the data to detect latent errors while they are still correctable.  This dynamic activity traverses the ZFS storage pool to read all data and verify it against its 256-bit checksum.  If necessary, ZFS repairs the data as it finds it.  All of this happens under the covers while the file system is up and actively servicing clients.
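The checksum-and-repair idea behind scrubbing a mirror can be illustrated with a toy Python sketch. This is not how ZFS is implemented: it models a two-way mirror as two Python lists, keeps the expected checksums in a separate list, and uses SHA-256 purely as a stand-in for a 256-bit checksum. All names here are hypothetical.

```python
import hashlib


def checksum(block: bytes) -> bytes:
    """Stand-in 256-bit checksum for a data block."""
    return hashlib.sha256(block).digest()


def scrub(mirror_a: list, mirror_b: list, checksums: list) -> int:
    """Verify every block on both mirror sides; repair a bad copy from the good one.

    Returns the number of blocks repaired.
    """
    repaired = 0
    for i, expected in enumerate(checksums):
        a_ok = checksum(mirror_a[i]) == expected
        b_ok = checksum(mirror_b[i]) == expected
        if a_ok and not b_ok:
            mirror_b[i] = mirror_a[i]   # heal side B from side A
            repaired += 1
        elif b_ok and not a_ok:
            mirror_a[i] = mirror_b[i]   # heal side A from side B
            repaired += 1
        elif not a_ok and not b_ok:
            raise IOError(f"block {i}: both copies fail verification")
    return repaired


blocks = [b"alpha", b"beta", b"gamma"]
checksums = [checksum(block) for block in blocks]
mirror_a, mirror_b = list(blocks), list(blocks)

mirror_b[1] = b"bxta"  # simulate silent corruption on one side of the mirror
print("blocks repaired:", scrub(mirror_a, mirror_b, checksums))
```

The point of the sketch is that the checksum is stored apart from the data it protects, so a damaged copy can be told apart from a good one and rewritten.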

ZFS-based storage servers have been implemented in both direct-attached and SAN-based architectures.  We have deployed ZFS at a number of customer sites across industry sectors, and it has been very successful with streaming video and mail stores.


ZFS has a great design for latency.  There is a separate ZFS Intent Log (ZIL) that also provides further file system stability.  In addition, ZFS has an Adaptive Replacement Cache (ARC), where it keeps pages in memory to improve buffering performance.


NFS
We are all pretty familiar with NFS as a vehicle for sharing data across a network, since it has been around forever (“forever” in computer technology terms, that is...).  Data can be accessed on Solaris-based ZFS servers via NFS by both Linux and Solaris clients.  Combining ZFS with NFS is very successful where users are already happy with the performance of NFS.  In combination with NFS, ZFS can support the same client counts as other NFS/NAS implementations, but with the added bonus of all the underlying data integrity and care of data one acquires by utilizing ZFS as the local file system store on the NFS server.


Sun StorageTek QFS
Sun StorageTek QFS is a SAN-based shared file system.  It can service hundreds of clients with petabytes of storage.  It is very fast, and I have personally experienced near “wire speed” with large implementations across the globe.  That said, the associated block storage architecture of shared storage LUNs is challenged in supporting both increased SCSI command queue depths and increased session counts per individual HBA.  This is really a LUN block storage issue with shared SAN access storage arrays: each Fibre Channel HBA on the storage array eventually cannot scale beyond around 128 nodes.  So, one must size this correctly to be successful any time one utilizes a SAN architecture for file sharing with any file system, not only QFS.  There is also a feature one can add to support the sharing of heterogeneous implementations of SAN-attached systems, discussed below.  QFS has been very successful at supporting heterogeneous clients where there were problems scaling NFS as a single mount point.  QFS has supported these implementations at much higher data throughput and, sized correctly, works very well.
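
As a rough illustration of the sizing point, the sketch below computes a lower bound on the number of array-side Fibre Channel ports for a given client count. It simply treats the "around 128 nodes" figure quoted above as a hard per-port ceiling, which is an assumption; the function name is hypothetical, and a real design would also weigh queue depths and bandwidth.

```python
import math

# Approximate per-port ceiling taken from the text; treat it as an assumption.
NODES_PER_ARRAY_PORT = 128


def min_array_ports(client_nodes, nodes_per_port=NODES_PER_ARRAY_PORT):
    """Lower bound on array-side FC ports needed for `client_nodes` SAN clients."""
    if client_nodes < 0 or nodes_per_port <= 0:
        raise ValueError("client_nodes must be >= 0 and nodes_per_port > 0")
    return math.ceil(client_nodes / nodes_per_port)
```

For example, min_array_ports(500) returns 4, so a 500-node shared SAN would need at least four array ports before any headroom is added.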


Sun SAM
Sun has another powerful solution with the Sun SAM file system, which is fully integrated with QFS.  In fact, folks will often refer to this as “SAM-QFS.”  SAM is used for data backup and archiving.  One can devise clear, very granular policies for hierarchical storage management (HSM) to ensure copies and versions of data are kept either online or on tape.  Data can be staged in and out of given storage pools, based upon the levels of data access one would like to manage and control.  SAM is also able to extend itself to other file systems beyond those I have described.  For example, I have worked on one case with IBM GPFS in which SAM was the HSM of choice, providing long-term care of GPFS data via backup and archive through SAM.  We have built a production environment that takes advantage of this: data is staged in and out of GPFS and to and from SAM-QFS for backup and archive data care.  There is heterogeneous support for data accessed by Linux clusters, IBM Power Systems, and Sun systems.  This is accomplished by copying data into SAM from the GPFS file storage pools.  Once in the SAM storage pool, the data is kept based upon the policies of the HSM.  Data sharing can also occur between QFS and the IBM systems via IBM's Tivoli SANergy software, supporting QFS for an even more direct data access path.


The Lustre File System
Lustre is another file service provided by Sun.  One of its strengths is networking: the protocols it utilizes provide access to data across many different types of interconnects, such as Quadrics, Ethernet, Myrinet, and Infiniband.  The more common forms of this type of data access today are Ethernet and Infiniband.  Lustre also scales very well and has been successful at servicing tens of thousands of clients concurrently.  It utilizes an object-based storage method to stripe data across object storage servers for very fast, highly intensive I/O for Linux-based clusters.  It can service many petabytes of data, and this capacity grows continuously.  So one can focus on the use of Lustre where NFS does not scale well with Linux-based clusters.  Traditionally, HPC clusters have data bandwidth requirements that can be very demanding, up to hundreds of gigabytes of data throughput per second.  Lustre can service clientele very well in this area.  There are failover scenarios one can implement so that Lustre continues to provide access to data should an object storage server go down or metadata services be impacted.  We have been very successful at implementing Lustre with Sun Fire X4500 Series Servers through Infiniband HCAs and drivers.
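
The striping idea can be sketched as a simple round-robin mapping from a file offset to an object storage target. This is a simplified model for illustration only, not Lustre's implementation: the function name and the fixed stripe_size and stripe_count parameters are assumptions.

```python
def locate(offset, stripe_size, stripe_count):
    """Map a file byte offset to (target index, offset inside that target's object).

    Assumes plain round-robin striping: stripe 0 goes to target 0, stripe 1 to
    target 1, and so on, wrapping around after `stripe_count` targets.
    """
    if offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; stripe_size and stripe_count > 0")
    stripe_number = offset // stripe_size
    target = stripe_number % stripe_count
    # Each full pass over all targets adds one stripe to every target's object.
    object_offset = (stripe_number // stripe_count) * stripe_size + offset % stripe_size
    return target, object_offset
```

Because consecutive stripes land on different targets, a large sequential read or write is spread across many servers at once, which is where the aggregate bandwidth comes from.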


In summary, one can utilize a simple implementation with ZFS today along with NFS for file sharing on smaller client counts and be extremely successful, thanks to the overall data integrity and stability that ZFS offers.

In addition, QFS is a good mid-sized offering with the fully integrated SAM features that were discussed.  On another note, QFS also has a Sun clustering feature that supports business continuance operations requiring high availability of data.


Finally, Lustre scales to the highest client counts and bandwidth requirements, and offers scalable storage as well.  Sun has also implemented Lustre on smaller and mid-sized clusters with great success, so don't get the idea that Lustre is only for very large deployments.  I have been personally involved with customer-accepted systems, based upon customer-verified data write performance with RAID 5, ranging in size from 6 Sun Fire X4500s up to 72 Sun Fire X4500s, and they scale very well with Lustre.

The Texas Advanced Computing Center (TACC, http://www.tacc.utexas.edu/) and Sun deployed the aforementioned 72 Sun Fire X4500s along with Sun's Constellation System and Infiniband technologies for TACC's Ranger System, which services the data-intensive computing requirements of National Science Foundation researchers.  Further details of TACC's Ranger system can be found here: http://www.tacc.utexas.edu/resources/hpcsystems/#constellation

So, that's fine, but Sun and TACC also combined the services of Lustre with SAM's ability to perform backups and archiving when we deployed the Sun Constellation TACC Ranger System.  Through the use of data movers, we were able to have both file systems deployed and working in unison with one another, servicing the NSF researchers.

Once again we are staging data into and out of Lustre from SAM for the data care aspects associated with business continuance.

So, even though one size doesn't fit all, there really is more than one way to peel this onion as we successfully have shown via TACC as well as other deployments we have done.

Sun has had success to date in implementing combinations of these file systems to meet differing demands.  We have done this, as described, with the combination of ZFS and NFS.  When one needs very high I/O bandwidth, at gigabytes per second, for Infiniband access to data for Linux clusters, I have implemented Lustre.  (It is worth mentioning that there is also a Linux client for QFS.)  When one combines this Linux client for QFS with a Linux Lustre client on the same system, one has a powerful data mover that can be used between file systems.  In addition, there are implementations that use both file systems through such software as GridFTP, a high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks.  This also has been implemented in conjunction with Lustre to provide shared access to SAM's HSM infrastructure.  Data can be copied from Lustre into and out of SAM, where, once placed, the automated policy engines of SAM kick in for business continuance and longer-term care of data.  So you see, combinations of these file systems can provide for extreme care of data.

On a final note, Sun is working to combine these file system services together.  It has been discussed publicly that Lustre will utilize ZFS as the underlying file system of choice, so that, moving forward, Lustre runs on top of ZFS.  Why? To provide all of the really great features of both ZFS and Lustre, combined to enhance data access, data integrity, data performance, and so on under a single combined file system.  In addition, similar work is underway to extend the HSM services described here to that same combined ZFS and Lustre offering.  Why again? Well, as we started this dialog: one size does not fit all.  Or will it, sooner rather than later?

Until Later –
Cheers -- Larry

Nick Kloski, Systems Technical Marketing

I am an optimistic person.  My job position in Sun is officially “Solutions Engineer” which is an interesting title only when you consider that everyone else in Silicon Valley has the word “Engineer” in their job description.  (I once saw a badge at a local theme park in the North Bay which had the ride operator's name on it “Suzy” but also had underneath it her title, “Experience Engineer”.  She must know the badge dude.)  So, let's take out the word Engineer.  What am I left with that describes me as a unique and vibrant person within a cool and dynamic place like Sun?  “Solutions.”

Not as bad as one might think, actually.  My area in Sun is a great one to be in, mostly because of the passion various people bring to all the varied things going on in this realm: Web 2.0.  My job at Sun is to figure out what Sun does exceptionally well, develop technical solutions to prove that, then tell everyone about my experiences.  I do a little bit of everything, from attending product meetings (some more interesting than others), to jumping on a terminal and configuring MySQL, to working with the research group at Sun on crazy future projects.

So here is my operating premise:  Sun does not have what you need... yet.  But through the work of my group, we can design something that addresses your problem.  My area is Web 2.0, so let's look at the toys I get to play with to build neat Webby2.0 things:


...And a cool lab to run these things in.

...And these are only a few of the things at my disposal!  

My ongoing challenge is to assume:


Sun has not yet shown you (the customer) the cool things we can do to the best of our ability.  Sun can do better, my team can do better, I can do better.


Sun does not have what you need....yet.

What can Sun do for you, or better, for the world?  Best to start large, in my opinion.

Nick

(I really am technical, I promise!  Future postings, if not guided by your comments, will be on various topics I think are interesting... comment on this blog to guide my thoughts on how Sun can do rockingly cool things :)

Pierre Reynes, Systems Technical Marketing



The Sun Fire x64 servers come pre-installed with Solaris 10. However, that does not mean that Linux and Windows users have to deploy their OS the hard way. Sun provides a free tool making OS installation simple and easy. Still installing your Linux/Windows OS the old way? Downloading drivers manually? Using a floppy drive??? Well, here is the good news: It is called SIA!


The Sun Installation Assistant (SIA) is a bootable CD with a graphical interface that simplifies system deployment for x64 Sun Fire and Sun Blade servers. SIA provides a step-by-step installation wizard for Linux and Windows. It automatically recognizes the hardware platform that it is running on and offers a list of supported SLES/RHEL Linux and Windows operating systems to choose from. Since it does not include any OS image, the user must provide the OS media and a valid license. SIA contains and automatically installs the drivers for the detected devices and supported option cards. It can also update itself and download the latest drivers by connecting to Sun. Finally, to make sure that the system is ready for production, SIA can also upgrade the ILOM Service Processor and the system BIOS.


If you are not using SIA and are still installing Linux/Windows the hard way, go to your system download page on Sun.com, download the SIA ISO image, and let SIA do the work for you. Direct links to the download pages are also provided on the SIA main page on Sun.com: http://www.sun.com/systemmanagement/sia.jsp

Main Installation Steps using SIA



  1. Insert the SIA CD in the system and boot from CD.

  2. If there is no physical CD drive in the system, SIA can also be used over the Service Processor (or CMM with Sun Blade modules) remote KVMS feature, from a USB flash drive, or over the network using PXE boot.

  3. If the system is connected to a network with active Internet gateway, check for latest updates with Remote Update.

  4. Insert the OS media to be installed and provide the information needed for a regular OS installation (e.g., license key, password, hostname).

  5. Complete the normal OS installation process

  6. Done!


Sun x64 Systems currently supported by SIA
AMD-based rackmount servers:



  • Sun Fire V20z Server

  • Sun Fire V40z Server

  • Sun Fire X4100 Server

  • Sun Fire X4100 M2 Server

  • Sun Fire X4200 Server

  • Sun Fire X4200 M2 Server

  • Sun Fire X4140 Server

  • Sun Fire X4240 Server

  • Sun Fire X4440 Server

  • Sun Fire X4500 Server

  • Sun Fire X4600 Server

  • Sun Fire X4600 M2 Server

Intel-based rackmount servers:


  • Sun Fire X4150 Server

  • Sun Fire X4450 Server

AMD-based Blade server modules:


  • Sun Blade X6220 Server Module

  • Sun Blade X8400 Server Module

  • Sun Blade X8420 Server Module

  • Sun Blade X8440 Server Module

Intel-based Blade server modules:


  • Sun Blade X6450 Server Module

  • Sun Blade X8450 Server Module

Understanding the Sun xVM Hypervisor Architecture was published yesterday. This is an important blueprint, long in the making! It started over a year ago, when longtime Sun BluePrints author Michael Haines came to me with the idea of creating a series on Xen. Xen morphed into the xVM Hypervisor, which is integral to OpenSolaris, and because features, interfaces, and even the name were continually changing, it was impossible to get a steady target for publication. This blueprint is so hot off the press that the bits in the PDF are still hot. Because this important technology is guaranteed to continue evolving, we will strive to keep this document up to date.

 

Many will be tempted to jump ahead to Chapter 5, “Advanced Installation and Configuration,” but this document is packed with essential background and information, including a pragmatic discussion of the types of virtualization available and where the Sun xVM hypervisor fits in. To further whet your appetite, I extract the following from the introductory chapter:


This Sun BluePrints article discusses the Sun xVM hypervisor architecture, a new approach to virtualization for x86 and x64 systems that makes it possible to run multiple disparate operating systems and applications on a single server.


  • “Creating Efficient Datacenters” discusses approaches to virtualization and introduces the Sun xVM hypervisor software architecture.
  • “Basic Control Domain Verification” describes how to create paravirtualized domains on x86 and x64 systems using Sun xVM hypervisor software.
  • “Virtual Machine Management” describes the command line and graphical tools available to manage guest domains.
  • “Advanced Installation and Configuration” takes a detailed look at the command line methods and advanced options available for configuring domains.
  • “Migration of Virtual Machine Instances” explains the live migration process.
  • “Troubleshooting” provides an overview of topics and techniques that may be useful when diagnosing domain-related problems.
  • “Hardware-Assisted Virtualization” explains how to install unmodified guest operating systems in hardware virtual machines.

Don't miss this important new blueprint! Visit http://www.sun.com/blueprints and read your free copy.

I am behind on this blog, just catching up. Although it is not the announcement vehicle for new content (I hope you are visiting the web page to see what is new), I do want to provide a little perspective. After a dry spell, we suddenly have a whole slew of new and interesting documents. In fact, in the last week we have posted 142 pages of very diverse content:

Using Logical Domains and CoolThreads Technology: Improving Scalability and System Utilization, by Ning Sun and Lee Anne Simmons documents a fascinating internal project that examined the application of LDoms to our CMT products to improve scalability and system utilization. It was found that configurations with 6 logical domains exhibited scalable performance improvements, yet still did not fully utilize system resources of the SPARC Enterprise T5220 server. A configuration with 12 logical domains increased the overall throughput by over 50 percent compared to the  6-domain configuration, while almost fully utilizing the available CPU resources assigned to the logical domains.
Using Solaris Cluster and Sun Cluster Geographic Edition is another contribution by longtime Sun BluePrints author Tim Read. Much has been written about Sun's many virtualization technologies. Tim provides a comprehensive survey of the application of these to the Solaris Cluster software (and its Open High Availability Cluster open source equivalent) with an eye on best practices.
Sun's Reference Architecture for Video Surveillance with ipConfigure ESM provides a brief overview of the opportunity to address the video surveillance market. The document presents an architecture built on the foundation of our Sun Fire X4500 Server as the archive server, used in conjunction with ipConfigure's ESM software and a pair of Sun Fire X4100 M2 servers that both provide management and drive a simulation. The system was tested under both server-based and camera-based motion-detection scenarios and demonstrated considerable scalability. This document provides both background and hard data, representing an important integration of our server and storage products.
The Managed Desktop Factory: Sun Virtual Desktop Infrastructure Software as a Service focuses on the application of ITIL methodologies to optimally deploy desktop environments throughout the enterprise via "Managed Desktop Factories" using thin clients, PCs, and even mobile devices.


We have more exciting and interesting content on the way, including a "definitive" blueprint on our Solaris xVM Hypervisor and a detailed view on Sun's own energy efficient datacenters.

Yesterday a momentous switchover took place: the www.sun.com/blueprints page was redirected to the new Wiki. That may seem like a minor event, but it eliminates a redundant feed and gets everyone reading out of the same book. And, the new book is well worth reading. Although still a work in progress, it now gives us the ability to get content up almost instantly. We have dropped the monthly edition designation and are publishing as quickly as possible.

 

Another enhancement, which I hope is appreciated, is that the summary page for each article has a more complete description of the contents, including the table of contents for longer articles. It is our hope that this will help the busy reader better assess the potential value of the blueprint. Also included on the summary page are two important additional pieces of information: the author biographies and the acknowledgments. Anyone who has tried to write (or practices it regularly) knows that it is a time-consuming avocation. Busy engineers who take time (often personal) to inscribe their best practices so formally demonstrate a special level of commitment: we should honor them. Likewise, those who assist with advice, review and corrections deserve everyone's thanks.

 

What about the future? More content, of course. We have a summer intern starting at the end of the month and have great plans for her time:

 

  • We will get the rest of the books posted.
  • In response to a reasonable request, we will make sure articles are clearly designated with their publication date. This is always a potentially important gauge of relevance; the older a publication, the more likely it is dated.
  • We are thinking of adding summary pages for older articles. Right now, we go back through 2004. It is my belief that earlier content is suspect, but it is always difficult to "throw away" technical content that might be of value.

We are also considering new media, such as podcast interviews, to enhance at least some of the blueprints. We are very much interested in your ideas.


 

Today marks something of a milestone for the Sun BluePrints Program: the first truly new content published first on the Sun BluePrints Wiki. This represents a significant change in our process: our destiny is now in our own hands, as we manage the content entirely ourselves using Sun's collection of channels for Wikis, blogs, forums and posting media. First of all, the notion of "monthly edition" is gone: we post content when it is ready. This is a good thing. Now, when we have a new PDF document, we post it ourselves on mediacast.sun.com, and we are able to assign a permalink to it; it is also easier to update this document as minor corrections are made.

 

There is always a moment of hesitation when one switches to something entirely new. After all, the "old" process worked for nine years. Or, did it? Actually, getting new content and updates posted involved more fuss than it was sometimes worth, and as I have mentioned earlier, our ability to create lists to browse by subject had broken down. No, the old way was dated and limited, and the new way is full of opportunity and power!

 

Meanwhile, let me introduce our latest article: Optimizing Systems to Use Flash Memory as a Hard Drive Replacement, by Om Narasimhan. We are entering an age in which flash memory storage devices, while not exactly cheap, provide significant advantages as a systems storage device. This new blueprint addresses this topic for Linux systems specifically, although the concepts apply to Solaris. To take a quote from the article:

When implemented properly, flash devices can boot systems faster and provide higher performance. Flash devices also naturally stay cooler than hard drives and can operate across a wider range of thermal conditions. However, installing an operating system in the default manner on a flash drive may not result in the best device performance or longevity.

A challenge with using flash memory in place of disk is that the longevity of the flash medium depends on how often it is written to. So, it stands to reason that one would want to minimize write activity: that is the central topic of this blueprint. There are a number of useful recommendations here. It is a very readable article that will bring you up to speed on a topic that will have growing importance.
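
As one small illustration of what "minimizing write activity" can mean on Linux, the Python sketch below scans fstab-style text and reports ext2/ext3/ext4 mounts that lack the noatime option, which stops the kernel from writing an access-time update on every file read. Whether the blueprint recommends this particular option is an assumption on my part; the function name and the sample entries are hypothetical.

```python
# Hypothetical fstab contents used only for demonstration.
SAMPLE_FSTAB = """\
# device     mount  type   options             dump pass
/dev/sda1    /      ext3   defaults            1    1
/dev/sda2    /var   ext3   noatime,nodiratime  1    2
tmpfs        /tmp   tmpfs  defaults            0    0
"""


def mounts_missing_noatime(fstab_text):
    """Return mount points of ext2/ext3/ext4 entries whose options lack noatime."""
    missing = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blanks and comments
        fields = line.split()
        if len(fields) < 4:
            continue                      # not a complete fstab entry
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        if fs_type in ("ext2", "ext3", "ext4") and "noatime" not in options.split(","):
            missing.append(mount_point)
    return missing
```

Running mounts_missing_noatime(SAMPLE_FSTAB) flags only the root file system in the sample, since /var already carries noatime and /tmp lives in memory.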

 

I would like to add one special note. By definition, Sun BluePrints articles are written and backed by Sun engineers. The originator of much of the material in this article left Sun before publication, and therefore is mentioned in the acknowledgments section: Phillip Martin. I want to thank Phillip for providing Om with such an excellent starting point for this blueprint.

 


During its heyday, the Sun BluePrints Program published a number of books. Producing books was an expensive proposition, requiring a lot of time, resources and budget. People love books--not always for the right reasons. Executive management likes them because they look impressive, even if they only end up as (expensive) doorstops. Engineers love being a named author of a book: what more visible credibility could one ask for? Do they pay for themselves? Books on really hot topics do--they can even make the author (and publisher) wealthy, but there aren't lots of them. Many of the Sun BluePrints books were successful only because we subsidized them.

 

In the "golden days" of our industry preceding the "Dot Com Crash" many technical companies made it effortless for employees to order technical books; I remember at Sun I could order books off of FatBrain, the books arrived quickly and someone was billed, no questions asked. Following the crash, many companies halted similar practices and sales of technical books plummeted. I know the final two Sun BluePrints books sold only on the order of 500 copies. With the loss of staff resources, we stopped producing books. I don't see us as starting up again, but in this business one should never say never.

 

We do own the rights to most of the PDFs of these books, and we used to make them available on the free CDs we distributed. Some of these books are quite old and out of date, and it is always amazing to find residual interest. We are afraid to throw anything away, just in case. So, it is with great pleasure that I announce that we will start posting these books on the new Sun BluePrints Wiki site. The first posted is Rob Snevely's book, Enterprise Data Center Design and Methodology. Not only did this book sell pretty well, we bought and gave a lot of copies away because they were good for business. Enterprise data centers are filled with computers--the kinds of things that pay Sun's light bills.

 

Books are being offered strictly as-is. These are pre-production PDFs, so they aren't always the exact same layout, but all of the information is there. By placing them on the Wiki it is my hope that we might get some traffic: comments about the book, even possibly "votes" for update and revival. While I said we are out of the book publishing business, I also observed that we should never say never.

We are working to expand our reach: more content reaching more readers. The first step is the new Wiki format, which not only makes it faster and easier for us to publish, but which encourages more interactivity. The next step will be creating new forms of content, possibly using different channels. The obvious candidate is to publish shorter blueprints directly in Wiki form; readers that want a PDF can have it generated for them via the Confluence engine; it won't have the fancy cover, and the graphics won't be as high a resolution. How about other media forms, such as podcasts? There is a lot of interest at Sun over the many social networking tools out there. How about using Second Life?

 

I'll be getting help. I have several open job requisitions: two for college intern positions, one for a new college graduate. For information, see www.sun.com/studentzone; just search for "kemer" to pull them up. All will be a part of a new team whose charter is not only to explore new media and channels, but to help create the content itself. Pretty cool job, I think: you get to work with some of Sun's best engineers to extract their pearls of wisdom about how to build better solutions.

An important event quietly happened yesterday: the Sun BluePrints Program saw the return of Vicky Hardman as program manager. The Dot Com crash had a ripple effect, painfully reducing the staff and budget dedicated to this program, significantly changing the way we did business. Vicky presided over what I would call the "golden age" of the program, and was very good at keeping things moving along and (especially) not letting them fall through the cracks. Her return heralds a revitalization of this nine-year-old program: just in time for our new Wiki front-end.

 

If you examine our current "archives", we have content stretching back to  April, 1999; we are about to enter our tenth year! Looking at these older articles is like strolling down memory lane, but technology is moving on, and much of that content is so old that it is not clear what we should do with it. Does anyone care about Solaris 8 any longer? How about PC Netlink 1.0? For this reason, I point only back through 2004 on the new Wiki page. Still, the information pack rat in me can't stand the thought of throwing anything away; perhaps I'll collect it and tag it something like "antiquated." I'm open to ideas.

 

As for other ideas: it is my hope to soon launch a second class of blueprint that will be published directly on the Wiki, i.e., not in PDF. Such articles would still go through our review process and we would  provide some minor editorial polishing, but they won't be the large, formal PDF documents. There are great advantages to this approach, particularly for content that is subject to frequent updating. It's not "either/or": both will exist. For those of you who are fans of BigAdmin, as am I, we are not intending to "compete." Indeed, BigAdmin has pointed to Sun BluePrints articles for years, and I'm going to get more active (now that Vicky is back!) to ensure that our latest articles are properly placed there. An important difference is that the content delivered by the Sun BluePrints Program always has a Sun engineer behind it and always goes through a review process. BigAdmin hosts many "best practice" documents from a wider source, providing a vital service to all.

 

We are entering "a new golden age," as prior principal contributor John Howard (who has moved on) used to joke. John, we will try to do you proud!

A long-standing controversy within the Sun BluePrints Program has been over the term "best practices."  As a superlative, the adjective "best" indicates that something excels everything of its class. This is the kind of language that drives our lawyers nuts: it is nearly impossible to prove and generally prohibitively expensive to even try.

 

Yet, the phrase "best practices" has crept into common usage with a little less rigor, often suggesting something that might be more correctly called "better practices."  Wikipedia has an interesting entry on the term "Best Practice" that mentions that it is used as a buzzword '...to describe the process of developing and following a standard way of doing things that multiple organizations can use for management, policy, and especially software systems.'

 

While we encourage our authors to pursue as much rigor as possible in presenting options and trade-offs, including their rationale for selection of one approach over another, sometimes it is just "good enough" to present something that we know works, based on the cumulative experience of the authors and the reviewers. Somehow the phrases "pretty good practices" or "good enough practices" just don't have the right ring to them. We will continue to use "best practice" to indicate the best approach based on the specific conditions and experience of the author; change any of those and you may find yet another "best" practice that works better for you. I would maintain that the real goal is to provide a pattern that the reader doesn't blindly follow, but rather adapts to their situation.
 

When we created the Sun BluePrints Online web site almost a decade ago, web authoring was a very different enterprise. Cathleen, our multi-purpose program manager/webmaster edited "raw" HTML; I'll bet using vi. The set up was pretty straightforward: there was the main landing page, and a couple of ancillary pages, including a "Browse by Date" and a "Browse by Subject" page. The browse-by-date page was (and still is) relatively straightforward: we just kept adding new article entries to the top. The browse-by-subject page was more problematic, because we created an arbitrary list of subjects, created sections for each subject on one page, then pasted the entry for a new article in each relevant section. This meant, of course, that we were replicating a block of text multiple times, with all of the usual "referential integrity" problems that resulted if you wanted to change anything. Even worse, adding a new subject was an arduous (and error-prone) manual process, forcing us to "re-index" everything by hand.

 

Things got out of hand a couple of years ago when that HTML file exceeded the 1 MB limit. Not only was it extremely slow to edit because of its size, but editing was also very error-prone. I knew at the time that the state of the art had advanced to largely automate these kinds of things, but our problem was that we were stuck back in the stone age of web authoring. I "solved" the problem by removing the browse-by-subject file altogether; I've consistently received about one email a month from readers asking me to put it back in.

 

With our new Wiki format, the solution is very easy. Every article (so far, I'm going back only to 2004) has its own landing page that allows not only comments, but also tags. I've created a starting set of tags and gone through and updated each of these pages. Incidentally, anyone with edit privileges on those pages (which are granted to anyone who creates a login account) can add to and edit those tags. So, if you see something missing or wrong, you can fix it. Communities really are cool...

 

Now for the really cool part: given tags, creating a list is one line of Wiki markup: {contentbylabel:<tag>|key=BluePrints|maxResults=99}.
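
For anyone curious about what that one line spares us, here is a minimal sketch in Python of the idea behind a tag-driven index. It is only an illustration, not how the wiki implements contentbylabel: the article titles, the tags, and the browse_by_subject function are all made up for the example, and max_results merely mimics the maxResults parameter in the markup above.

```python
from collections import defaultdict

# Hypothetical articles: each entry is stored exactly once, with its tags.
ARTICLES = [
    {"title": "Article A", "tags": {"networking", "security"}},
    {"title": "Article B", "tags": {"storage"}},
    {"title": "Article C", "tags": {"networking"}},
]


def browse_by_subject(articles, max_results=99):
    """Group article titles by tag, keeping at most max_results per tag."""
    index = defaultdict(list)
    for article in articles:
        for tag in sorted(article["tags"]):
            if len(index[tag]) < max_results:
                index[tag].append(article["title"])
    return dict(index)


if __name__ == "__main__":
    # The subject listing is regenerated from the tags on every run,
    # so nothing has to be pasted into several sections by hand.
    for tag, titles in sorted(browse_by_subject(ARTICLES).items()):
        print(f"{tag}: {', '.join(titles)}")
```

Because each article lives in one place and the listing is derived from its tags, adding a new subject is just a matter of adding a tag; there is nothing to re-index by hand.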


Now that we have this set up, I'm very interested in what the tags should be. I'm hoping we build enough of a community so that I will start getting recommendations. 

Like many bloggers on blogs.sun.com, I'm guilty of dipping my toe into the water and then either not jumping in, or perhaps jumping in and then right out again. There is a lot of excitement at Sun over the numerous web-based communication media now available. So many options, so much overlap between them: which should we use?

 

I began using this blog to solve the wrong problem: revitalizing the web page for the Sun BluePrints Program. I'm embarrassed to say that the format of that site hasn't changed since I joined the program seven years ago. Even worse, I discovered recently that there was no direct path from the sun.com front page to us; people might "bump" into it, but if they didn't know about it, nothing was going to lead them there. I thought it might be a good idea to post summaries of new articles on this blog, pointing back to that page, in the hopes that it might "advertise" the Sun BluePrints site. This actually ignored at least two problems:

  • The old web site was still old and ugly...
  • With so many blogs available (perhaps too many...), who is going to pay attention to a monthly "advertisement" for another web site?

I have jumped back into the water; I'm practically skinny dipping. First of all, last weekend I started "re-inventing" the Sun BluePrints home by building a wiki: The Sun BluePrints Wiki. I'm going to discuss that transition more in later blogs, but for now there are several solid reasons why this is the way to go:

  • We can maintain it directly, making updates faster and easier. We plan to make everything much more dynamic.
  • It makes it possible for readers to comment on everything. We want comments, even the ultra-critical ones: this is what community is all about.
  • In addition to the more formally produced PDFs, we want to promote shorter Wiki-centric "bluenotes".
  • We can build an RSS feed (very soon, now), which readers periodically ask for.
  • Last, but not least, we once again have a way of organizing content by category. This has been my biggest complaint over the last several years: why did we eliminate the "browse by subject" page? I'll discuss that more in a later blog, but the miracle of the wiki is that I can tag content, then create lists automatically, based on tag queries. That is the way it should be!

 

I'm hoping that we will "retire" the old site sooner rather than later, redirecting it to the wiki.

 

As for this blog, I'm going to "redirect" it, too. Rather than just announcing new content--something that the wiki will do automatically--I thought I would share more from my 20 years at Sun, with a special focus on both the best practices that I see and my experiences with the Sun BluePrints Program: the premier site for best practices developed and reviewed by Sun engineers.

 

Kemer Thomson