String Internals

How C# string semantics are mapped to std::string in sharp-runtime.

The Core Mapping

A C# string (reference type, immutable, UTF-16 encoded) maps to a C++ std::string (value type, mutable, with UTF-8 expected by convention but not enforced). These differences in type category, mutability, and encoding are the most significant semantic gap in the string story.

| Property      | C# string                     | sharp-runtime std::string                       |
|---------------|-------------------------------|-------------------------------------------------|
| Type category | Reference type                | Value type (copyable)                           |
| Mutability    | Immutable                     | Mutable                                         |
| Encoding      | UTF-16                        | UTF-8 (by convention)                           |
| Null          | Can be null                   | Cannot be null (empty string is the convention) |
| Char type     | char = UTF-16 code unit       | char = 8-bit byte                               |
| Length        | s.Length = UTF-16 code units  | s.size() = bytes                                |
| Equality      | Value equality by default     | Value equality via ==                           |
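
The type-category and mutability rows can be illustrated with plain standard C++. The sketch below is a minimal illustration, not sharp-runtime code; the helper names make_greeting_copy and shout are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: copying a std::string duplicates its characters,
// so changing the copy never affects the original (value semantics).
std::string make_greeting_copy(const std::string& original) {
    std::string copy = original;  // deep copy, not a second reference
    copy += "!";                  // mutates only the copy
    return copy;
}

// Hypothetical helper: std::string is mutable, so a function that takes a
// non-const reference can change the caller's string in place.
void shout(std::string& s) {
    s += "!";
}
```

In C#, the equivalent of shout would have to return a new string, because an existing string instance can never change.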

Why std::string?

Using std::string rather than a custom class avoids the overhead and complexity of a managed string type. It interoperates naturally with all C++ standard library APIs. The trade-off is that callers must be aware of the encoding difference when dealing with non-ASCII characters.

System::String Role

System::String is a static utility class with a deleted constructor. It provides the .NET-style static API (String.IsNullOrEmpty, String.Format, etc.) but does not hold any string data. All string data lives in std::string instances.

include/System/String.hpp
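
A static utility class of this kind could be shaped roughly as follows. This is a sketch under assumptions, not the contents of String.hpp: the sketch namespace is hypothetical, and the parameter type of IsNullOrEmpty is a guess.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace sketch {

// Holds no data; exists only as a home for .NET-style static functions.
class String {
public:
    String() = delete;  // the class cannot be instantiated

    // Assumed signature: with std::string there is no null state,
    // so the check reduces to testing for emptiness.
    static bool IsNullOrEmpty(const std::string& value) {
        return value.empty();
    }
};

static_assert(!std::is_default_constructible<String>::value,
              "String is a pure static utility class");

}  // namespace sketch
```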

Null vs Empty

In C#, null and "" are distinct: null is the absence of a string, while "" is a string with no characters. In sharp-runtime, because std::string cannot be null, the convention is to represent both as the empty string:

std::string name = "";  // stands in for a null string from .NET
if (System::String::IsNullOrEmpty(name)) {
    // handles both the "null" and "" cases
}
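
Ported code that genuinely depends on telling null apart from "" cannot do so under this convention. One possible workaround, which is an assumption of this sketch and not something sharp-runtime is known to provide, is std::optional<std::string> from C++17. The helper names below are hypothetical.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper: an empty optional plays the role of null.
bool is_null_or_empty(const std::optional<std::string>& value) {
    return !value.has_value() || value->empty();
}

// Hypothetical helper: collapses "null" to "" at the boundary where the
// value is handed to code that follows the empty-string convention.
std::string value_or_empty(const std::optional<std::string>& value) {
    return value.value_or(std::string());
}
```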

The charcs / char16_t Alias

C# char is a 16-bit Unicode value (UTF-16 code unit). The sharp-runtime alias for this is charcs = char16_t, defined in SharpRuntimeHelper.hpp:

using charcs = char16_t;

When porting C# code that uses individual char values, use charcs. When porting code that works with strings as a whole, use std::string.

Note that std::string contains char (8-bit), not charcs. If you need a UTF-16 string, use std::u16string, although most sharp-runtime APIs use std::string.
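
A short sketch of working with individual charcs values and with std::u16string follows. The alias is repeated here so the example is self-contained; the two helper functions are hypothetical and handle ASCII digits only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

using charcs = char16_t;  // same alias as described for SharpRuntimeHelper.hpp

// A string of charcs is exactly std::u16string.
static_assert(std::is_same<std::basic_string<charcs>, std::u16string>::value,
              "std::u16string stores char16_t code units");

// Hypothetical helper: tests a single UTF-16 code unit for an ASCII digit.
bool is_ascii_digit(charcs c) {
    return c >= u'0' && c <= u'9';
}

// Hypothetical helper: walks a UTF-16 string one code unit at a time.
std::size_t count_ascii_digits(const std::u16string& text) {
    std::size_t count = 0;
    for (charcs c : text) {
        if (is_ascii_digit(c)) {
            ++count;
        }
    }
    return count;
}
```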

Encoding

sharp-runtime does not enforce a specific encoding in std::string. By convention, strings are expected to contain UTF-8. The System::Text::Encoding hierarchy provides explicit encode/decode operations:

include/System/Text/Encoding.hpp
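
To make the encoding gap concrete without guessing at the System::Text::Encoding API, the sketch below decodes UTF-8 into UTF-16 code units using only the standard library. The function utf8_to_utf16 is hypothetical and simplified: it rejects truncated or malformed sequences but does not check for overlong encodings or encoded surrogates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of sharp-runtime): converts UTF-8 bytes
// into UTF-16 code units, throwing std::invalid_argument on bad input.
std::u16string utf8_to_utf16(const std::string& utf8) {
    std::u16string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::uint32_t cp = 0;   // decoded code point
        std::size_t extra = 0;  // continuation bytes expected after the lead
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > utf8.size() - i - 1) {
            throw std::invalid_argument("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cont = static_cast<unsigned char>(utf8[i + k]);
            if ((cont & 0xC0) != 0x80) {
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (cont & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF) {
            throw std::invalid_argument("code point out of range");
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));  // one code unit
        } else {
            cp -= 0x10000;  // encode as a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

The size of the result is what C# would report as s.Length for the same text, while size() on the UTF-8 input reports bytes. For "café" the two values are 4 and 5.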

StringBuilder

System::Text::StringBuilder wraps a std::string and provides append/insert/remove/replace operations. It mirrors the .NET StringBuilder API for mutable string accumulation:

System::Text::StringBuilder sb;
sb.Append("Hello");
sb.Append(", ");
sb.Append("world");
std::string result = sb.ToString(); // "Hello, world"

include/System/Text/StringBuilder.hpp
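
The idea of wrapping a std::string can be sketched as below. This is not the implementation in StringBuilder.hpp: the sketch namespace, the parameter types, and the choice to return a reference for chaining are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {

// Minimal builder that forwards each operation to an internal std::string.
class StringBuilder {
public:
    StringBuilder& Append(const std::string& text) {
        buffer_ += text;
        return *this;  // assumed: allows chained calls, as in .NET
    }

    // Throws std::out_of_range if index is past the end of the buffer.
    StringBuilder& Insert(std::size_t index, const std::string& text) {
        buffer_.insert(index, text);
        return *this;
    }

    // Throws std::out_of_range if start is past the end of the buffer.
    StringBuilder& Remove(std::size_t start, std::size_t length) {
        buffer_.erase(start, length);
        return *this;
    }

    std::string ToString() const { return buffer_; }  // returns a copy

private:
    std::string buffer_;
};

}  // namespace sketch
```

Because std::string is already mutable, such a wrapper adds no capability over calling += directly. Its value is in letting ported C# code keep its original shape.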

String Interning and Identity

.NET interns string literals, so literals with the same content can share one object and compare equal by reference. sharp-runtime has no such mechanism: two std::string objects with the same content are equal via == but are distinct objects in memory. This is the expected C++ behavior.
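
The difference between content equality and object identity can be checked directly in standard C++. The helper names below are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: compares characters, like == on strings in C#.
bool same_content(const std::string& a, const std::string& b) {
    return a == b;
}

// Hypothetical helper: compares addresses, the closest C++ counterpart
// to Object.ReferenceEquals in .NET.
bool same_object(const std::string& a, const std::string& b) {
    return &a == &b;
}
```

Ported code that relied on reference equality between interned strings should be rewritten to compare content with == instead.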