Description
Full name
Dhruv Arvind Singh
University status
Yes
University name
Indian Institute Of Information Technology, Nagpur
University program
Computer Science Engineering
Expected graduation
2027
Short biography
I’m a second-year Computer Science and Engineering student at the Indian
Institute of Information Technology, Nagpur. My fascination with technology began in
high school when I started experimenting with mobile app development, which
quickly turned into a passion for building things with code.
College opened up new opportunities for me to explore computer science
more systematically. I got hands-on with languages like Python and JavaScript,
gradually developing an interest in backend systems, data modeling, and the logic
behind how real-world applications work. That curiosity has fueled my learning
journey ever since.
Over time, my interest started shifting toward building real-world
applications. That curiosity led me to explore TypeScript and Rust, which introduced
me to the world of web development and backend architecture. I eventually
immersed myself in full-stack development, working with frameworks like React,
FastAPI, and PostgreSQL, while also exploring containerization and CI/CD pipelines.
Lately, I’ve been actively contributing to open-source initiatives that align
with my interests in backend systems and tooling. These experiences have not only
deepened my technical skills but also connected me with a global community of
developers and mentors.
Timezone
Indian Standard Time (GMT+5:30)
Contact details
Platform
Linux
Editor
My preferred code editor is VS Code, with extensions such as ESLint and Dev Containers.
Programming experience
I was first introduced to programming in high school through Kotlin, which
sparked my curiosity for using code to solve logical problems. That early experience
laid the foundation for my deeper dive into computer science after entering college,
where I explored languages like C and C++ to enhance my algorithmic thinking and
understand computational fundamentals.
As I delved further, I became increasingly interested in building user-facing
applications. That’s when I transitioned into web development, picking up JavaScript
and gradually expanding into the full-stack ecosystem with technologies like
React.js, Next.js, and Express.js. To diversify my backend skill set, I’ve recently been
exploring Rust and its ecosystem.
Throughout this journey, I’ve worked on a variety of projects—from interactive
web apps to CLI-based utilities. One of my favourite projects is Multiplayer Ludo, a
real-time game platform that connects users globally for online matches. The
system is built on a Node.js backend with WebSocket support for live interaction
and uses MySQL for persistent data. Features like custom game rooms and targeted
matchmaking enhance the multiplayer experience significantly.
JavaScript experience
I was first introduced to JavaScript during a college course, and that
moment marked the beginning of my journey into web development. As I
started experimenting with it more, I soon discovered frameworks like React
and Express, which I used to build one of my early full-stack projects: a
chatbot.
That experience sparked a deeper interest in development, eventually
leading me into the world of open-source. Contributing to real-world
projects has significantly strengthened my JavaScript skills and given me
valuable insight into writing maintainable, production-level code.
Node.js experience
Over time, I’ve become quite comfortable working with Node.js,
handling everything from writing scripts and building RESTful APIs to
working with file systems and integrating databases. My experience has
been shaped largely by backend-focused projects, where I frequently used
Express.js to architect and manage server-side logic.
I've also spent a lot of time exploring the npm ecosystem—leveraging
a variety of libraries to streamline development and enhance functionality.
One project I’m particularly proud of is a collaborative whiteboard
application. It features real-time drawing powered by a WebSocket server,
with a Node.js backend and a Next.js frontend. MySQL serves as the primary
database, enabling persistent and synchronized user interactions.
C/Fortran experience
In college, C was the first language I was introduced to, and I’ve
developed a solid understanding of it through solving various competitive
programming problems focused on data structures and algorithms.
As part of my first-semester coursework, I built a string library in C
from scratch. For my third-semester project, I developed a Library
Management System in C++—a terminal-based application that utilizes the
full capabilities of C++ classes and manages data through a .csv file,
including storage and manipulation.
I’ve also built several small projects to further strengthen my
understanding.
While I don’t have much experience with Fortran, I’m always open to
exploring new languages and technologies.
Interest in stdlib
JavaScript is often seen purely as a tool for building user interfaces, but its
role in data engineering and analysis is quickly evolving. With its growing
ecosystem and runtime versatility, it's becoming just as capable in backend and
scientific workflows. That’s why stdlib’s mission to strengthen JavaScript’s utility
across domains feels especially timely—and it’s a movement I’m excited to
contribute to.
While working on various contributions, I’ve always been fascinated by the
array and data-type utilities in stdlib and have been consistently impressed by how
approachable and well-structured the library is. Even with limited exploration,
it’s evident that stdlib offers robust support for a broad range of development
needs.
What makes the experience even better is the sense of community. Clear
onboarding documentation, supportive maintainers, and an active contributor
network made getting involved not just easy but enjoyable. It’s helped me refine
my skills, expand my knowledge, and grow as a developer through real-world
collaboration.
Contributing to this project definitely made me a better programmer and I wish to learn and grow more!
Version control
Yes
Contributions to stdlib
o Adds C implementation:
○ #4388 (merged) @stdlib/stats/base/dists/logistic/logpdf
○ #4424 (merged) @stdlib/stats/base/dists/laplace/logpdf
○ #4437 (merged) @stdlib/stats/base/dists/laplace/quantile
○ #4324 (merged) @stdlib/stats/base/dists/weibull/mode
○ #4422 (merged) @stdlib/stats/base/dists/laplace/logcdf
○ #4352 (merged) @stdlib/stats/base/dists/logistic/quantile
○ #4440 (merged) @stdlib/stats/base/dists/laplace/cdf
○ #4790 (open) @stdlib/math/base/special/gammainc
○ #4455 (open) @stdlib/math/base/special/rising-factorial
o Refactor existing math/special/BLAS packages to follow current conventions:
○ #4651 (merged) @stdlib/stats/base/dmeanvarpn
○ #4648 (merged) @stdlib/stats/base/dmeanvar
○ #4648 (merged) @stdlib/stats/base/dmeanstdevpn
○ #4618 (merged) @stdlib/math/base/assert/is-finitef
○ #4617 (merged) @stdlib/math/base/assert/is-finite
○ #4615 (merged) @stdlib/math/base/assert/is-infinitef
○ #4614 (merged) @stdlib/math/base/assert/is-nanf
○ #4612 (merged) @stdlib/stats/base/dmeanstdev
○ #4539 (merged) @stdlib/stats/base/dnanstdevch
○ #4538 (merged) @stdlib/stats/base/dnanvariancepn
○ #4537 (merged) @stdlib/stats/base/dnanstdevpn
○ #4536 (merged) @stdlib/stats/base/dnanstdevyc
○ #4511 (merged) @stdlib/stats/base/snanvarianceyc
○ #4535 (merged) @stdlib/stats/base/dnanvarianceyc
○ #4509 (merged) @stdlib/stats/base/sstdevpn
○ #4508 (merged) @stdlib/stats/base/sstdevch
○ #4507 (merged) @stdlib/stats/base/sstdev
○ #4505 (merged) @stdlib/stats/base/snanvariancetk
○ #4504 (merged) @stdlib/stats/base/snanvariancewd
○ +10 more merged PRs from stats/base/*
o Add ndarray support:
○ #4543 (open) @stdlib/stats/base/dvarmpn
○ #4720 (open) @stdlib/stats/base/dmeanvarpn
○ #4726 (open) @stdlib/stats/base/dmeanpn
○ #4727 (open) @stdlib/stats/base/sdsnanmeanors
o Ideas: Proposed the following features/ideas (issues):
○ #4889 (open) adds napi/create-bool
○ #4635 (merged) adds napi/argv-bool
stdlib showcase
I have explored the stdlib repository through hands-on experimentation and by building educational and practical demos that highlight its capabilities. Faculty in my university's mechanical engineering department also use stdlib's numerical functions to solve difficult mathematical problems.
Goals
This project aims to introduce a dedicated string-typed array, called
StringArray, designed to support variable-length strings. The main motivation
behind adding this data type is to improve interoperability between JavaScript and
C. This is especially important for enabling support for ndarrays with string data
types, as a significant portion of ndarray iteration logic is implemented in C.
It also aims to add the string methods that StringArray needs, with thorough
error handling, along with the assertion utilities and other supporting packages
that StringArray requires.
Why this project?
Unlike numeric types, which have fixed sizes and integrate seamlessly into
typed arrays, strings present a more complex challenge because of their variable
length. This project tackles that issue by proposing a structured memory layout
tailored for string storage—offering both efficiency and clarity. It’s a critical step
toward enabling robust, high-performance handling of textual data in low-level
JavaScript environments.
Additionally, I have prior experience with string methods and manipulation from
my string.h library project.
Qualifications
During my time in college, I’ve studied JavaScript alongside fundamental
computer science subjects such as object-oriented programming, algorithms,
operating systems, computer architecture, Linux, and Git. I’ve also developed a solid
understanding of the stdlib codebase and have spent a significant amount of time
researching existing string implementations to explore how these new features can
be effectively introduced.
Prior art
I have prior experience implementing string methods and string manipulation from my string library project in C.
I researched how strings are implemented in libraries and languages such as NumPy and Java; their approaches are summarized below.
### NumPy:
NumPy stores string data as UTF-8 sequences, in which each character takes 1-4 bytes of storage. It uses Bjoern Hoehrmann’s DFA-based UTF-8 decoder to validate UTF-8 sequences.
UTF-8 is a variable-width encoding, which means:
- ASCII Characters (U+0000 to U+007F): 1 byte each
- Latin/Greek/Cyrillic etc. (U+0080 to U+07FF): 2 bytes each
- CJK and other BMP scripts (U+0800 to U+FFFF): 3 bytes each
- Supplementary characters (U+10000 to U+10FFFF): 4 bytes each
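The ranges above translate directly into a byte-length calculation. The following is a minimal sketch in plain JavaScript; the function names `utf8ByteLength` and `utf8StringByteLength` are hypothetical and are not part of stdlib or NumPy.

```javascript
'use strict';

/**
* Returns the number of bytes needed to encode one Unicode code point in UTF-8.
* (Hypothetical helper, for illustration only.)
*/
function utf8ByteLength( codePoint ) {
	if ( codePoint < 0 || codePoint > 0x10FFFF ) {
		throw new RangeError( 'invalid code point: ' + codePoint );
	}
	if ( codePoint <= 0x7F ) {
		return 1; // ASCII
	}
	if ( codePoint <= 0x7FF ) {
		return 2; // Latin/Greek/Cyrillic etc.
	}
	if ( codePoint <= 0xFFFF ) {
		return 3; // rest of the BMP
	}
	return 4; // supplementary planes
}

/**
* Returns the total UTF-8 byte length of a JavaScript string.
*/
function utf8StringByteLength( str ) {
	var total = 0;
	var ch;

	// `for...of` iterates by code point, so a surrogate pair is visited once:
	for ( ch of str ) {
		total += utf8ByteLength( ch.codePointAt( 0 ) );
	}
	return total;
}

console.log( utf8StringByteLength( 'héllo' ) ); // 1 + 2 + 1 + 1 + 1 = 6
```

A StringArray would need this kind of calculation whenever it converts between JavaScript's UTF-16 strings and a UTF-8 backing store.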
Each string carries the following metadata flags:
• NPY_STRING_MISSING: Whether the string is missing/null.
• NPY_STRING_INITIALIZED: Whether it has been initialized.
• NPY_STRING_OUTSIDE_ARENA: Whether it is stored outside the arena (the arena is one of the string storage methods discussed below).
• NPY_STRING_LONG: Whether the string is longer than 255 bytes.
Empty strings are handled specially for efficiency.
The implementation uses thread-safe memory management through mutex locks.
In NumPy, each string element is stored in a memory block that contains a `size` variable and a `*buffer` pointer. The `size` variable stores the size of the data at `*buffer` (each character can take 1-4 bytes). If the arena storage method is used, a `cursor` variable is also present, storing the current position in the arena.
NumPy uses three methods for storing strings, depending on the size of the string: short string, arena, and heap.
**Short string**:
• Strings whose size is less than or equal to 15 bytes (on 64-bit systems) or 7 bytes (on 32-bit systems) use this storage method.
• In this method, string data is stored directly in the array buffer.
• The flags and size are stored together in a single byte called `size_and_flags`: the upper 4 bits store the flags mentioned above, and the lower 4 bits store the size.
• String methods identify the storage method from the flags.
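To make the combined size-and-flags field concrete, here is a minimal sketch of packing a short string's size and flags into one byte. It assumes the flags occupy the high four bits and the size the low four bits; the flag values and function names are illustrative assumptions, not copied from NumPy's source.

```javascript
'use strict';

// Illustrative flag values occupying the high nibble (assumed, not authoritative):
var FLAG_MISSING = 0x80;
var FLAG_INITIALIZED = 0x40;
var FLAG_OUTSIDE_ARENA = 0x20;
var FLAG_LONG = 0x10;

var SIZE_MASK = 0x0F;  // low four bits
var FLAGS_MASK = 0xF0; // high four bits

/**
* Packs a short-string size (0-15) and a set of flags into a single byte.
*/
function packSizeAndFlags( size, flags ) {
	if ( size < 0 || size > SIZE_MASK ) {
		throw new RangeError( 'short string size must be between 0 and 15' );
	}
	return ( flags & FLAGS_MASK ) | ( size & SIZE_MASK );
}

/**
* Extracts the size from a packed byte.
*/
function unpackSize( byte ) {
	return byte & SIZE_MASK;
}

/**
* Extracts the flags from a packed byte.
*/
function unpackFlags( byte ) {
	return byte & FLAGS_MASK;
}

var packed = packSizeAndFlags( 6, FLAG_INITIALIZED | FLAG_OUTSIDE_ARENA );
console.log( unpackSize( packed ) );                         // 6
console.log( ( unpackFlags( packed ) & FLAG_LONG ) !== 0 );  // false
```

Four bits can represent sizes 0 through 15, which matches the 15-byte limit for short strings on 64-bit systems.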
**Arena**:
• Strings whose size is greater than 15 bytes (on 64-bit systems) or 7 bytes (on 32-bit systems) use this storage method.
• The arena is a contiguous block of memory used to store one or more strings.
• Each block contains the string data.
• It stores the starting address of the string and adds an offset to it in order to access the string element.
• The size is stored in the `size_and_flags` variable.
• The arena grows by a factor of 1.25.
• It works well for arrays of multiple strings, since we only need to store the starting address of each string.
**Heap/Long string**:
• Strings whose size is greater than 255 bytes are stored here.
• It uses a direct pointer to heap memory rather than an offset into the arena.
• Since the arena becomes inefficient beyond 255 bytes, NumPy uses the heap.
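The three storage methods described above can be summarized as a simple size-based decision. This is a simplified sketch that follows the thresholds stated in this proposal (15 or 7 bytes for short strings, 255 bytes for the arena); the function name `storageKind` is hypothetical, and the real NumPy logic may involve additional conditions.

```javascript
'use strict';

/**
* Returns which storage method would hold a string of the given UTF-8 byte
* length, following the size thresholds described in this proposal.
*
* @param {number} byteLength - UTF-8 byte length of the string
* @param {boolean} is64Bit - whether the target is a 64-bit system
* @returns {string} 'short', 'arena', or 'heap'
*/
function storageKind( byteLength, is64Bit ) {
	var shortMax = ( is64Bit ) ? 15 : 7;
	if ( byteLength <= shortMax ) {
		return 'short'; // stored directly in the array buffer
	}
	if ( byteLength <= 255 ) {
		return 'arena'; // stored as an offset into the arena
	}
	return 'heap'; // stored behind a direct heap pointer
}

console.log( storageKind( 6, true ) );   // 'short'
console.log( storageKind( 100, true ) ); // 'arena'
console.log( storageKind( 300, true ) ); // 'heap'
```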
### Java
Java uses two methods for storing strings. One stores the string in heap memory; the other stores two copies, one in the heap and one in a separate area inside the heap dedicated to strings (SLC). This implementation appears to be out of scope for our project.
Idea:
Taking the Complex64Array and BooleanArray implementations as references, one approach is to use a Uint32Array to store the UTF-8 sequence together with four metadata fields. This approach is similar to NumPy’s short string memory allocation method and would have a timeline and approach similar to those of BooleanArray. StringArray prototype functions can be modeled on C’s string library, general TypedArray methods, and native JavaScript methods.
StringArray can also draw on NumPy’s arena and short string implementations by allocating additional free space equal to 0.25 times the required length. The extra space is not accessible to the user until data is stored in it and the array length grows to cover the index. This reduces the chance of having to reallocate string data.
Ex: let a = new StringArray( "6_byte" ); // initializes a string with a capacity of 8 ( ceil( 1.25*6 ) ), with only indices 0 to 5 accessible to the user.
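The over-allocation rule in the example can be written as a one-line helper. This is a minimal sketch; the name `capacityFor` is hypothetical and the growth factor of 1.25 is the value proposed above.

```javascript
'use strict';

// Proposed growth factor: required length plus 25% spare capacity.
var GROWTH_FACTOR = 1.25;

/**
* Returns the number of elements to allocate for a string of the given length.
*/
function capacityFor( length ) {
	return Math.ceil( GROWTH_FACTOR * length );
}

console.log( capacityFor( 6 ) ); // 8
console.log( capacityFor( 4 ) ); // 5
```

Rounding up with `Math.ceil` guarantees at least one spare slot for any string of length 1 or more, at the cost of slightly more memory for very short strings.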
We will need to store four metadata fields, namely `size`, `length`, `is_initialized`, and `is_null`, similar to the NumPy string array:
o size: (may be removed after final discussion) stores the length of the initialized Uint32Array.
o length: stores the length of the accessible string.
o is_initialized: indicates whether the string is initialized.
o is_null: indicates whether the StringArray is an empty array.
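As a rough illustration of how the Uint32Array backing store and the four metadata fields might fit together, the sketch below packs the UTF-8 bytes of each code point into one 32-bit element and over-allocates by 25%. The helper names `packCodePoint` and `createStringBuffer`, and the choice of one code point per element, are assumptions made for illustration; the final layout is subject to discussion with mentors.

```javascript
'use strict';

/**
* Packs the UTF-8 encoding (1-4 bytes) of a code point into one unsigned
* 32-bit integer, most significant byte first.
*/
function packCodePoint( cp ) {
	if ( cp <= 0x7F ) {
		return cp;
	}
	if ( cp <= 0x7FF ) {
		return ( ( 0xC0 | ( cp >> 6 ) ) << 8 ) | ( 0x80 | ( cp & 0x3F ) );
	}
	if ( cp <= 0xFFFF ) {
		return ( ( 0xE0 | ( cp >> 12 ) ) << 16 ) |
			( ( 0x80 | ( ( cp >> 6 ) & 0x3F ) ) << 8 ) |
			( 0x80 | ( cp & 0x3F ) );
	}
	// `>>> 0` converts the signed 32-bit result to an unsigned value:
	return ( ( ( 0xF0 | ( cp >> 18 ) ) << 24 ) |
		( ( 0x80 | ( ( cp >> 12 ) & 0x3F ) ) << 16 ) |
		( ( 0x80 | ( ( cp >> 6 ) & 0x3F ) ) << 8 ) |
		( 0x80 | ( cp & 0x3F ) ) ) >>> 0;
}

/**
* Builds a backing buffer and the four proposed metadata fields for a string.
*/
function createStringBuffer( str ) {
	var data;
	var size;
	var cps;
	var i;

	// Iterate by code point so that surrogate pairs become a single element:
	cps = Array.from( str, function toCodePoint( ch ) {
		return ch.codePointAt( 0 );
	});
	size = Math.ceil( 1.25 * cps.length ); // includes spare capacity
	data = new Uint32Array( size );
	for ( i = 0; i < cps.length; i++ ) {
		data[ i ] = packCodePoint( cps[ i ] );
	}
	return {
		'data': data,
		'size': size,
		'length': cps.length,
		'is_initialized': true,
		'is_null': cps.length === 0
	};
}

var buf = createStringBuffer( '6_byte' );
console.log( buf.size, buf.length ); // 8 6
```

Storing one code point per 32-bit element keeps indexing constant-time, which is the main reason a fixed-width element type is attractive for ndarray iteration.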
StringArray’s ndarray implementation will be easier if we use Uint32Array as the base.
It will have the following static properties:
• BYTES_PER_ELEMENT: the size of each memory block (4 bytes).
• name: the string "StringArray".
It will have the following properties and methods:
• name: returns “StringArray”.
• byteLength: returns string length in bytes.
• byteOffset: returns offset (in bytes) of the array from the start of its underlying
• BYTES_PER_ELEMENT: returns each block’s size i.e 4 bytes.
• length: returns length of string.
• from( src[, clbk[, thisArgs]] ): Creates a new StringArray from an array-like object or iterable. The clbk argument is an optional callback function invoked for each src element; thisArgs is an optional context for clbk.
• of( src ): Creates a new StringArray from a variable number of arguments.
• map( callbackfn[, thisArgs] ): Returns a new array in which each element is the result of the provided callback function. thisArgs is an optional context for callbackfn.
• get( index ): Returns the string data at the provided index.
• set( value, index ): Sets the data at the provided index.
• indexOf( value ): Returns the index of the first occurrence of value in the StringArray.
• lastIndexOf( value ): Returns the index of the last occurrence of value in the StringArray.
• toLowerCase(): Returns a StringArray in which every element is lowercase.
• toUpperCase(): Returns a StringArray in which every element is uppercase.
• reverse(): Reverses the StringArray.
• includes( value ): Returns true if value exists in the array and false otherwise.
• startsWith( value ): Returns true if the array starts with the same values as value and false otherwise.
• endsWith( value ): Returns true if the array ends with the same values as value and false otherwise.
• slice( arg1, arg2 ): Returns a StringArray containing the elements from index arg1 up to index arg2 of the parent StringArray.
• substring( arg1, arg2 ): Returns a StringArray containing the elements from index arg1 up to index arg2 of the parent StringArray. The only difference between substring and slice is that, if arg1 > arg2, substring returns the elements from index arg2 up to arg1, while slice returns an empty array.
Many more methods will be added after final discussion.
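The difference between slice and substring described in the list above can be sketched on a plain array of elements. The helpers `sliceRange` and `substringRange` are hypothetical stand-ins for the eventual StringArray methods and only cover non-negative indices.

```javascript
'use strict';

/**
* Returns the elements from `start` (inclusive) to `end` (exclusive).
* When `start > end`, the result is empty.
*/
function sliceRange( arr, start, end ) {
	return arr.slice( start, end );
}

/**
* Same as `sliceRange`, except that the bounds are swapped when `start > end`.
*/
function substringRange( arr, start, end ) {
	var lo = Math.min( start, end );
	var hi = Math.max( start, end );
	return arr.slice( lo, hi );
}

var chars = [ 'a', 'b', 'c', 'd', 'e', 'f' ];

console.log( sliceRange( chars, 1, 4 ) );     // [ 'b', 'c', 'd' ]
console.log( sliceRange( chars, 4, 1 ) );     // []
console.log( substringRange( chars, 4, 1 ) ); // [ 'b', 'c', 'd' ]
```

This mirrors the behavior of the native `String.prototype.slice` and `String.prototype.substring` methods for non-negative arguments.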
We will need to add assertion functions (is-stringarray and is-same-stringarray) to @stdlib/assert/*, similar to those for other typed arrays.
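A type-check assertion for the new array could look roughly like the sketch below. The `StringArray` constructor here is an empty placeholder, and the constructor-name fallback is an assumption about how instances from other realms might be detected; the actual package would follow whatever pattern mentors prefer.

```javascript
'use strict';

/**
* Placeholder constructor standing in for the proposed StringArray.
*/
function StringArray() {
	if ( !( this instanceof StringArray ) ) {
		return new StringArray();
	}
	return this;
}

/**
* Tests whether a value is a StringArray.
*
* Falls back to comparing the constructor name so that instances created
* by a different copy of the constructor are still recognized.
*/
function isStringArray( value ) {
	if ( value instanceof StringArray ) {
		return true;
	}
	return (
		typeof value === 'object' &&
		value !== null &&
		typeof value.constructor === 'function' &&
		value.constructor.name === 'StringArray'
	);
}

console.log( isStringArray( new StringArray() ) ); // true
console.log( isStringArray( [] ) );                // false
```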
The ndarray implementation comes next, where we need to add StringArray support to the ndarray wrapper and add the essential assert packages.
Lastly, a well-documented README.md file will be added, describing the behavior, methods, and other details of the newly added package.
Commitment
My summer break starts on May 15, which means I’ll be fully available once the official coding period begins on May 27. For the first two months, I can dedicate over 40 hours per week to the project, as I won’t have any overlapping commitments during that time.
In the final month, when college resumes, I’ll still be able to contribute around 25 hours per week alongside my coursework. Altogether, I expect to commit between 400 and 450 hours to the program, comfortably meeting the time expectations for the project.
Schedule
Assuming a 12 week schedule,
● Community Bonding Period:
○ Discuss and plan the proposed features in detail to gain more clarity on the goals and approach.
○ Once a clear roadmap is finalized, we can start early as my summer break would begin on May 15.
● Week 1, 2 & 3 :
○ Implement the StringArray constructor function with error handling.
○ Write test cases and benchmark files, and create the README.md file documenting the package as it stands.
● Week 4 & 5:
○ Implement the easier prototype methods discussed above, plus any additional methods finalized after discussion (e.g., indexOf, lastIndexOf, get, set, toLowerCase, toUpperCase, includes, startsWith, endsWith).
○ List the methods in the README, with tests.
● Week 6:
○ Implement the more difficult prototype methods discussed above, plus any additional methods finalized after discussion (e.g., map).
(midterm): By midterm, most features should be complete: indexOf, lastIndexOf, get, set, toLowerCase, toUpperCase, includes, startsWith, endsWith, and map, with documentation, benchmarks, and test files.
● Week 7, 8 & 9:
○ Implement StringArray support in ndarray with test cases and benchmarks.
○ Implement the remaining difficult prototype methods with test cases and benchmarks.
● Week 10:
○ Add the necessary packages to @stdlib/assert/* and @stdlib/array/base/assert/*.
○ Write tests for the added features and complete remaining work.
● Week 11:
○ Continue writing and finalizing tests and complete remaining work.
○ Write tutorials and documentation.
● Week 12:
○ Handle pending work, bugs, tests, etc.
● Final Week: Project submission!
Notes:
- The community bonding period is a 3 week period built into GSoC to help you get to know the project community and participate in project discussion. This is an opportunity for you to set up your local development environment, learn how the project's source control works, refine your project plan, read any necessary documentation, and otherwise prepare to execute on your project proposal.
- Usually, even week 1 deliverables include some code.
- By week 6, you need enough done at this point for your mentor to evaluate your progress and pass you. Usually, you want to be a bit more than halfway done.
- By week 11, you may want to "code freeze" and focus on completing any tests and/or documentation.
- During the final week, you'll be submitting your project.
Related issues
● #44 - [Idea]: add support for string arrays in stdlib
Checklist
- I have read and understood the Code of Conduct.
- I have read and understood the application materials found in this repository.
- I understand that plagiarism will not be tolerated, and I have authored this application in my own words.
- I have read and understood the patch requirement which is necessary for my application to be considered for acceptance.
- I have read and understood the stdlib showcase requirement which is necessary for my application to be considered for acceptance.
- The issue name begins with [RFC]: and succinctly describes your proposal.
- I understand that, in order to apply to be a GSoC contributor, I must submit my final application to https://summerofcode.withgoogle.com/ before the submission deadline.